Can Agent Conquer Web? Exploring the Frontiers of ChatGPT Atlas Agent in Web Games Paper • 2510.26298 • Published Oct 30, 2025 • 45
UniREditBench: A Unified Reasoning-based Image Editing Benchmark Paper • 2511.01295 • Published Nov 3, 2025 • 38
EBT-Policy: Energy Unlocks Emergent Physical Reasoning Capabilities Paper • 2510.27545 • Published Oct 31, 2025 • 48
When Modalities Conflict: How Unimodal Reasoning Uncertainty Governs Preference Dynamics in MLLMs Paper • 2511.02243 • Published Nov 4, 2025 • 24
Don't Blind Your VLA: Aligning Visual Representations for OOD Generalization Paper • 2510.25616 • Published Oct 29, 2025 • 95
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought Paper • 2511.02779 • Published Nov 4, 2025 • 58