16 Commits

Author SHA1 Message Date
PeterGriffinJin
7d6a15bfc5 remove unuseful file 2025-04-29 16:02:26 +00:00
PeterGriffinJin
a2870cb320 add multinode support 2025-04-10 12:26:43 +00:00
PeterGriffinJin
0b26e614f7 fix proto bug 2025-04-04 02:54:21 +00:00
PeterGriffinJin
7530318919 fix test dataloader shuffle bug 2025-04-02 22:23:11 +00:00
PeterGriffinJin
d874947732 fix turns_stats logging bug 2025-03-21 14:58:42 +00:00
PeterGriffinJin
83d10313be add action status 2025-03-19 22:19:27 +00:00
PeterGriffinJin
9ec2fa9892 fix grpo id bug 2025-03-19 18:59:19 +00:00
PeterGriffinJin
8c7f04ca45 response length include retrieval info 2025-03-19 00:36:21 +00:00
Bowen Jin
50cedb2c00 Merge pull request #21 from xiaobo-yang/yxb/fix-info-mask-bugs
Fix bugs related to loss mask, meta info, and response length
2025-03-18 19:33:50 -05:00
PeterGriffinJin
4b3c09451a fix kl loss issue 2025-03-18 20:07:47 +00:00
PeterGriffinJin
e85506f143 remove unnecessary codes 2025-03-17 16:08:33 +00:00
xiaobo-yang
32719b5119 Fix bugs related to loss mask, meta info, and response length
1. Construct the loss mask immediately after obtaining the observation to prevent encoding misalignment when converting back to tokens after text transformation.
2. Follow up on meta info to ensure that the test batch can apply do_sample.
3. Remove the recording of info for response length.
2025-03-14 14:25:40 +08:00
PeterGriffinJin
118c6e7361 fix reward bug 2025-03-13 19:18:56 +00:00
PeterGriffinJin
5fdfd52ac4 add logger 2025-03-01 04:28:07 +00:00
PeterGriffinJin
f0b4eef7d3 fix logging_utils bug 2025-03-01 04:08:28 +00:00
PeterGriffinJin
068516be64 Initial commit 2025-02-28 15:16:19 +00:00
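
Note on commit 32719b5119, item 1: the message says the loss mask is built as soon as the observation is obtained, so that it does not have to be recovered after the text is converted back to tokens. The sketch below illustrates that idea at the token level only. It is not the repository's code: the Rollout class, its method names, and the token ids are hypothetical, and it assumes that model-generated tokens get mask 1 and observation (retrieved) tokens get mask 0.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rollout:
    """Hypothetical container: token ids of a multi-turn response plus a parallel loss mask."""
    token_ids: List[int] = field(default_factory=list)
    loss_mask: List[int] = field(default_factory=list)  # 1 = counted in the loss, 0 = ignored

    def add_model_tokens(self, ids: List[int]) -> None:
        # Tokens produced by the policy are counted in the loss.
        self.token_ids.extend(ids)
        self.loss_mask.extend([1] * len(ids))

    def add_observation_tokens(self, ids: List[int]) -> None:
        # Observation tokens are masked out at the moment they are appended,
        # so the mask never depends on decoding to text and re-tokenizing.
        self.token_ids.extend(ids)
        self.loss_mask.extend([0] * len(ids))


rollout = Rollout()
rollout.add_model_tokens([11, 12, 13])      # made-up ids, e.g. a query written by the model
rollout.add_observation_tokens([901, 902])  # made-up ids, e.g. a retrieved passage
rollout.add_model_tokens([14, 15])          # made-up ids, e.g. the final answer

# The mask stays aligned with the tokens by construction.
assert len(rollout.token_ids) == len(rollout.loss_mask)
print(rollout.loss_mask)  # [1, 1, 1, 0, 0, 1, 1]
```

Because ids and mask are always extended together, their lengths cannot drift apart, which is the kind of misalignment the commit message describes avoiding.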