Collection of models and the paper for the work "Reinforcement Learning for Reasoning in Large Language Models with One Training Example"
- Reinforcement Learning for Reasoning in Large Language Models with One Training Example
  Paper • 2504.20571
- ypwang61/One-Shot-RLVR-Qwen2.5-Math-7B-pi1
  Text Generation
- ypwang61/One-Shot-RLVR-Qwen2.5-Math-7B-pi1_pi13
  Text Generation
- ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1
  Text Generation