Dissecting Tool-Integrated Reasoning: An Empirical Study and Analysis Paper • 2508.15754 • Published 4 days ago • 2
Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published 4 days ago • 218
Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published 4 days ago • 218
Dissecting Tool-Integrated Reasoning: An Empirical Study and Analysis Paper • 2508.15754 • Published 4 days ago • 2
Dissecting Tool-Integrated Reasoning: An Empirical Study and Analysis Paper • 2508.15754 • Published 4 days ago • 2 • 2
CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward Paper • 2508.03686 • Published 20 days ago • 33
CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward Paper • 2508.03686 • Published 20 days ago • 33
CompassJudger-2: Towards Generalist Judge Model via Verifiable Rewards Paper • 2507.09104 • Published Jul 12 • 17
Rethinking Verification for LLM Code Generation: From Generation to Testing Paper • 2507.06920 • Published Jul 9 • 28
Coding Triangle: How Does Large Language Model Understand Code? Paper • 2507.06138 • Published Jul 8 • 20
Coding Triangle: How Does Large Language Model Understand Code? Paper • 2507.06138 • Published Jul 8 • 20
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16 • 263
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 177
REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards Paper • 2505.24760 • Published May 30 • 69
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning Paper • 2502.06781 • Published Feb 10 • 60