Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme • Paper • 2504.02587 • Published 5 days ago • 28
PaperBench: Evaluating AI's Ability to Replicate AI Research • Paper • 2504.01848 • Published 6 days ago • 34
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 • Paper • 2503.24376 • Published 8 days ago • 35
Expanding RL with Verifiable Rewards Across Diverse Domains • Paper • 2503.23829 • Published 9 days ago • 17
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models • Paper • 2503.24235 • Published 8 days ago • 49
CLS-RL: Image Classification with Rule-Based Reinforcement Learning • Paper • 2503.16188 • Published 19 days ago • 9
Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning • Paper • 2503.16252 • Published 19 days ago • 27
DAPO: An Open-Source LLM Reinforcement Learning System at Scale • Paper • 2503.14476 • Published 21 days ago • 115
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization • Paper • 2503.12937 • Published 22 days ago • 27
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning • Paper • 2503.10291 • Published 26 days ago • 33
Self-Taught Self-Correction for Small Language Models • Paper • 2503.08681 • Published 28 days ago • 13
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning • Paper • 2503.09516 • Published 27 days ago • 27
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond • Paper • 2503.10460 • Published 26 days ago • 27
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning • Paper • 2503.07572 • Published 29 days ago • 41
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL • Paper • 2503.07536 • Published 29 days ago • 84
MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning • Paper • 2503.07365 • Published 29 days ago • 56
Big-Math • Collection • This collection contains assets associated with the Big-Math dataset, a high-quality collection of over 250,000 math questions with verifiable answers. • 3 items • Updated Mar 6 • 4