4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation Paper • 2512.17012 • Published 6 days ago • 38
Towards Cross-View Point Correspondence in Vision-Language Models Paper • 2512.04686 • Published 20 days ago
RoboOS-NeXT: A Unified Memory-based Framework for Lifelong, Scalable, and Robust Multi-Robot Collaboration Paper • 2510.26536 • Published Oct 30
RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics Paper • 2512.13660 • Published 9 days ago • 36
EditThinker: Unlocking Iterative Reasoning for Any Image Editor Paper • 2512.05965 • Published 19 days ago • 38
Geometrically-Constrained Agent for Spatial Reasoning Paper • 2511.22659 • Published 27 days ago • 40
TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics Paper • 2510.07181 • Published Oct 8 • 1
RoboRefer & RefSpatial Collection RoboRefer weights, RefSpatial Dataset and RefSpatial-Bench • 9 items • Updated Oct 24 • 3
LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions Paper • 2510.08211 • Published Oct 9 • 22
How Far are VLMs from Visual Spatial Intelligence? A Benchmark-Driven Perspective Paper • 2509.18905 • Published Sep 23 • 29