BioInsight: Multi-Agent Orchestration for Interactive Biomedical Knowledge Discovery Paper • 2606.20997 • Published 13 days ago • 6
Trimming the Long-Tail of Visual World Modeling Evaluation Paper • 2606.24256 • Published 9 days ago • 36
GBC: Gradient-Based Connections for Optimizing Multi-Agent Systems Paper • 2606.28187 • Published 6 days ago • 12
PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems Paper • 2606.22388 • Published 11 days ago • 96
AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints Paper • 2606.05622 • Published 28 days ago • 44
Advancing Creative Physical Intelligence in Large Multimodal Models Paper • 2605.26396 • Published May 25 • 21
Useful Memories Become Faulty When Continuously Updated by LLMs Paper • 2605.12978 • Published May 13 • 19
CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing Paper • 2605.02910 • Published May 6 • 23
PEARL: Self-Evolving Assistant for Time Management with Reinforcement Learning Paper • 2601.11957 • Published Jan 28 • 3
MedSAM3: Delving into Segment Anything with Medical Concepts Paper • 2511.19046 • Published Nov 24, 2025 • 55
Where LLM Agents Fail and How They can Learn From Failures Paper • 2509.25370 • Published Sep 29, 2025 • 12