3 Commits

Author SHA1 Message Date
PeterGriffinJin ba78b68eb4 update train script 2025-04-09 19:31:20 +00:00
PeterGriffinJin 9ec2fa9892 fix grpo id bug 2025-03-19 18:59:19 +00:00
PeterGriffinJin c4e0269cfc add grpo script 2025-03-12 15:16:03 +00:00