Questions about data scale

#1
by masterLan - opened

How much data was used to train the final version of Qwen-2.5-MATH-PRM?

The final version of Qwen2.5-Math-PRM (the Qwen2.5-Math-PRM-7B model) was trained on about 3 million MC-estimation samples. A consensus filtering step retained only about 40% of them, leaving a final training set of roughly 1.2 million high-consensus samples.
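To make the filtering step concrete, here is a minimal Python sketch. It assumes "consensus filtering" means keeping a sample only when two independent label sources (e.g. MC-estimated step labels and a second judge's labels) fully agree; the field names `mc_labels` and `judge_labels` and the toy data are hypothetical, not the authors' actual pipeline.

```python
def consensus_filter(samples):
    """Keep samples whose two sets of per-step labels fully agree.

    Each sample is a dict with hypothetical keys 'mc_labels' and
    'judge_labels', each a list of 0/1 step-correctness labels.
    """
    kept = []
    for sample in samples:
        if sample["mc_labels"] == sample["judge_labels"]:
            kept.append(sample)
    return kept


if __name__ == "__main__":
    # Toy data: 5 samples, 2 of which have agreeing labels.
    samples = [
        {"mc_labels": [1, 1, 0], "judge_labels": [1, 1, 0]},  # agree -> kept
        {"mc_labels": [1, 0, 0], "judge_labels": [1, 1, 0]},  # disagree
        {"mc_labels": [1],       "judge_labels": [1]},        # agree -> kept
        {"mc_labels": [0, 1],    "judge_labels": [1, 1]},     # disagree
        {"mc_labels": [1, 1],    "judge_labels": [0, 1]},     # disagree
    ]
    kept = consensus_filter(samples)
    print(f"retained {len(kept)}/{len(samples)} samples")

    # At the scale described above, a ~40% retention rate on 3M samples
    # leaves roughly 3,000,000 * 0.4 = 1,200,000 training samples.
    print(f"estimated final set: {int(3_000_000 * 0.4):,}")
```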
