CVE-Factory: Scaling Expert-Level Agentic Tasks for Code Security Vulnerability Paper • 2602.03012 • Published Feb 3 • 2
CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs Paper • 2602.03048 • Published Feb 3 • 32
ToolACE-MCP: Generalizing History-Aware Routing from MCP Tools to the Agent Web Paper • 2601.08276 • Published Jan 13 • 7
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published Jan 11 • 214
ARES: Multimodal Adaptive Reasoning via Difficulty-Aware Token-Level Entropy Shaping Paper • 2510.08457 • Published Oct 9, 2025 • 13