SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization Paper • 2604.02268 • Published 3 days ago • 81
Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills Paper • 2603.25158 • Published 10 days ago • 48
Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs? Paper • 2603.24472 • Published 11 days ago • 48
Online Experiential Learning for Language Models Paper • 2603.16856 • Published 19 days ago • 57
Code-A1: Adversarial Evolving of Code LLM and Test LLM via Reinforcement Learning Paper • 2603.15611 • Published 20 days ago • 10
LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory Paper • 2603.03269 • Published Mar 3 • 62
DreamWorld: Unified World Modeling in Video Generation Paper • 2603.00466 • Published Feb 28 • 16
Heterogeneous Agent Collaborative Reinforcement Learning Paper • 2603.02604 • Published Mar 3 • 191
Internalizing Meta-Experience into Memory for Guided Reinforcement Learning in Large Language Models Paper • 2602.10224 • Published Feb 10 • 19
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning Paper • 2602.08234 • Published Feb 9 • 72
SCALE: Self-uncertainty Conditioned Adaptive Looking and Execution for Vision-Language-Action Models Paper • 2602.04208 • Published Feb 4 • 19
InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning Paper • 2602.06960 • Published Feb 6 • 14
MemGUI-Bench: Benchmarking Memory of Mobile GUI Agents in Dynamic Environments Paper • 2602.06075 • Published Feb 3 • 13
AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration Paper • 2602.03786 • Published Feb 3 • 90
SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training Paper • 2602.03411 • Published Feb 3 • 39
Learning Query-Specific Rubrics from Human Preferences for DeepResearch Report Generation Paper • 2602.03619 • Published Feb 3 • 28
SWE-World: Building Software Engineering Agents in Docker-Free Environments Paper • 2602.03419 • Published Feb 3 • 41