SpatialWorld: Benchmarking Interactive Spatial Reasoning of Multimodal Agents in Real-World Tasks Paper • 2606.09669 • Published 27 days ago • 47
Imagination Helps Visual Reasoning, But Not Yet in Latent Space Paper • 2602.22766 • Published Feb 26 • 45
AgenticDataBench: A Comprehensive Benchmark for Data Agents Paper • 2607.01647 • Published 3 days ago • 25
WorldDirector: Building Controllable World Simulators with Persistent Dynamic Memory Paper • 2607.02517 • Published 3 days ago • 20
Program-as-Weights: A Programming Paradigm for Fuzzy Functions Paper • 2607.02512 • Published 3 days ago • 68
RoadBench: Benchmarking MLLMs on Fine-Grained Spatial Understanding and Reasoning under Urban Road Scenarios Paper • 2511.18011 • Published Nov 22, 2025
UrbanWell: Benchmarking Multimodal Large Language Models for Spatio-Temporal Urban Wellbeing Analytics Paper • 2606.15890 • Published 21 days ago
Agentic Abstention: Do Agents Know When to Stop Instead of Act? Paper • 2606.28733 • Published 8 days ago • 141
EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments Paper • 2606.13681 • Published 24 days ago • 142
SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning Paper • 2606.13673 • Published 24 days ago • 110
ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time? Paper • 2606.05553 • Published about 1 month ago • 50
Code2LoRA: Hypernetwork-Generated Adapters for Code Language Models under Software Evolution Paper • 2606.06492 • Published about 1 month ago • 95
GRAIL: Generating Humanoid Loco-Manipulation from 3D Assets and Video Priors Paper • 2606.05160 • Published Jun 3 • 8
SpatialAct: Probing Spatial Reasoning-to-Action Capabilities of VLM Agents in 3D Scenes Paper • 2605.31148 • Published May 29 • 3