Search-R1/search_r1
xiaobo-yang 32719b5119 Fix bugs related to loss mask, meta info, and response length
1. Construct the loss mask immediately after obtaining the observation to prevent encoding misalignment when converting back to tokens after text transformation.
2. Carry the meta info through so that the test batch can apply do_sample.
3. Remove the recording of response length in the info dict.
2025-03-14 14:25:40 +08:00
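
The first fix can be illustrated with a minimal sketch. This is not the repository's code: the `Rollout` class, its method names, and the `toy_encode` tokenizer are hypothetical stand-ins. It shows one way to extend the loss mask in the same step that observation tokens are appended, so the mask is never rebuilt by decoding to text and re-encoding.

```python
from dataclasses import dataclass, field
from typing import List


def toy_encode(text: str) -> List[int]:
    """Hypothetical stand-in for a real tokenizer: one token id per character."""
    return [ord(ch) for ch in text]


@dataclass
class Rollout:
    """Hypothetical container holding token ids and a parallel loss mask."""
    token_ids: List[int] = field(default_factory=list)
    loss_mask: List[int] = field(default_factory=list)

    def append_response(self, ids: List[int]) -> None:
        # Model-generated tokens contribute to the loss (mask value 1).
        self.token_ids.extend(ids)
        self.loss_mask.extend([1] * len(ids))

    def append_observation(self, ids: List[int]) -> None:
        # Observation tokens come from the environment, so they are masked
        # out (mask value 0). The mask is extended in the same step as the
        # ids, which keeps both lists aligned token for token.
        self.token_ids.extend(ids)
        self.loss_mask.extend([0] * len(ids))


# Illustrative strings only; they are not taken from the repository.
response_1 = toy_encode("model asks a question")
observation = toy_encode("retrieved passage")
response_2 = toy_encode("model gives an answer")

rollout = Rollout()
rollout.append_response(response_1)
rollout.append_observation(observation)
rollout.append_response(response_2)
```

Because the mask is written when the tokens are appended, its length always matches the token sequence, even if a later decode and re-encode of the same text would produce a different number of tokens.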