AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play Paper • 2509.24193 • Published Sep 29, 2025 • 6
Position: The Hidden Costs and Measurement Gaps of Reinforcement Learning with Verifiable Rewards Paper • 2509.21882 • Published Sep 26, 2025
Scaling Agentic Reinforcement Learning for Tool-Integrated Reasoning in VLMs Paper • 2511.19773 • Published Nov 24, 2025 • 9
The Invisible Leash: Why RLVR May Not Escape Its Origin Paper • 2507.14843 • Published Jul 20, 2025 • 85
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving Paper • 2507.06229 • Published Jul 8, 2025 • 75
Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration Paper • 2504.04915 • Published Apr 7, 2025
MedAgentGym: Training LLM Agents for Code-Based Medical Reasoning at Scale Paper • 2506.04405 • Published Jun 4, 2025 • 7