Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR Paper • 2508.14029 • Published 7 days ago • 72
MHPP: Exploring the Capabilities and Limitations of Language Models Beyond Basic Code Generation Paper • 2405.11430 • Published May 19, 2024 • 2
FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving Paper • 2502.20238 • Published Feb 27, 2025 • 24