LoGoPlanner: Localization Grounded Navigation Policy with Metric-aware Visual Geometry Paper • 2512.19629 • Published 16 days ago • 25
LongVideoAgent: Multi-Agent Reasoning with Long Videos Paper • 2512.20618 • Published 14 days ago • 53
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations Paper • 2512.21004 • Published 14 days ago • 12
Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models Paper • 2512.20557 • Published 15 days ago • 49
Omni-Weather: Unified Multimodal Foundation Model for Weather Generation and Understanding Paper • 2512.21643 • Published 13 days ago • 11