Article Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models By AI-MO and 17 others • 7 days ago • 39
Spurious Rewards Collection Spurious Rewards: Rethinking Training Signals in RLVR • 14 items • Updated Jun 13 • 2
MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation Paper • 2505.17613 • Published May 23 • 8
Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models Paper • 2505.17015 • Published May 22 • 9
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models Paper • 2505.14810 • Published May 20 • 63
One-Shot RLVR Collection Collections of models and papers for works: "Reinforcement Learning for Reasoning in Large Language Models with One Training Example" • 14 items • Updated Jun 13 • 1
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published Apr 29 • 97
Article How NuminaMath Won the 1st AIMO Progress Prize By yfleureau and 7 others • Jul 11, 2024 • 122