Reinforcement Learning Foundations for Deep Research Systems: A Survey • Paper 2509.06733 • Published 6 days ago
Understanding R1-Zero-Like Training: A Critical Perspective • Paper 2503.20783 • Published Mar 26