Absolute Zero: Reinforced Self-play Reasoning with Zero Data • Paper • 2505.03335 • Published 6 days ago • 107
Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning • Paper • 2505.01441 • Published 14 days ago • 34
LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis • Paper • 2505.02625 • Published 7 days ago • 20
Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks • Paper • 2505.00234 • Published 11 days ago • 22
DeepCritic: Deliberate Critique with Large Language Models • Paper • 2505.00662 • Published 10 days ago • 48
Beyond the Last Answer: Your Reasoning Trace Uncovers More than You Think • Paper • 2504.20708 • Published 13 days ago • 22
WebThinker: Empowering Large Reasoning Models with Deep Research Capability • Paper • 2504.21776 • Published 11 days ago • 43
NEMOTRON-CROSSTHINK: Scaling Self-Learning beyond Math Reasoning • Paper • 2504.13941 • Published 26 days ago • 10
SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning • Paper • 2504.19162 • Published 15 days ago • 15
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? • Paper • 2504.13837 • Published 23 days ago • 121
ReZero: Enhancing LLM search ability by trying one-more-time • Paper • 2504.11001 • Published 27 days ago • 14
AlayaDB: The Data Foundation for Efficient and Effective Long-context LLM Inference • Paper • 2504.10326 • Published 28 days ago • 25
SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models • Paper • 2504.11468 • Published Apr 10 • 28