Questions about data scale
#1, opened by masterLan
How much data was used to train the final version of Qwen-2.5-MATH-PRM?
The final version of Qwen2.5-Math-PRM (the Qwen2.5-Math-PRM-7B model) started from about 3 million samples labeled by Monte Carlo (MC) estimation. A consensus filtering step retained only about 40% of them, which leaves a final training set of approximately 1.2 million high-consensus samples.
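For anyone who wants to check the arithmetic, here is a minimal sketch. The helper name `retained_samples` is hypothetical and only for illustration. Both inputs are the rounded figures quoted in the answer, so the result is an estimate and not an exact dataset size.

```python
def retained_samples(total: int, keep_fraction: float) -> int:
    """Approximate number of samples left after a filter that keeps `keep_fraction`."""
    return round(total * keep_fraction)


# Rounded figures taken from the answer above (both approximate).
total_mc_samples = 3_000_000  # samples labeled by MC estimation
keep_fraction = 0.40          # share kept by consensus filtering

final_train_size = retained_samples(total_mc_samples, keep_fraction)
print(f"{final_train_size:,}")
```

With these inputs the estimate comes out at 1.2 million, matching the figure in the answer.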