Think Only When You Need with Large Hybrid-Reasoning Models Paper • 2505.14631 • Published 4 days ago • 18
Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning Paper • 2505.13866 • Published 4 days ago • 14
Improving Assembly Code Performance with Large Language Models via Reinforcement Learning Paper • 2505.11480 • Published 8 days ago • 7
AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning Paper • 2505.11896 • Published 7 days ago • 54
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models Paper • 2505.10554 • Published 9 days ago • 113
Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning Paper • 2505.01441 • Published 26 days ago • 36
WebThinker: Empowering Large Reasoning Models with Deep Research Capability Paper • 2504.21776 • Published 24 days ago • 53
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published 25 days ago • 91
NodeRAG: Structuring Graph-based RAG with Heterogeneous Nodes Paper • 2504.11544 • Published Apr 15 • 42
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations Paper • 2504.10481 • Published Apr 14 • 84
DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning Paper • 2504.07128 • Published Apr 2 • 84
Search-R1-v0.2 Collection Exploration of a more stable RL pipeline with outcome-only reward and scaled-up LLMs. https://arxiv.org/abs/2503.09516 • 25 items • Updated about 17 hours ago • 3
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay Paper • 2504.03601 • Published Apr 4 • 16
T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models Paper • 2504.04718 • Published Apr 7 • 41
VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks Paper • 2504.05118 • Published Apr 7 • 25