Mol-R1: Towards Explicit Long-CoT Reasoning in Molecule Discovery Paper • 2508.08401 • Published 14 days ago • 39
Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning Paper • 2504.13914 • Published Apr 10 • 4
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce Paper • 2504.11343 • Published Apr 15 • 19 • 6
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published Mar 18 • 137