Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning • Paper • 2507.06485 • Published Jul 9 • 4
4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time • Paper • 2506.18890 • Published Jun 23 • 6
MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert Aggregation • Paper • 2506.17113 • Published Jun 20 • 4
Executable Functional Abstractions: Inferring Generative Programs for Advanced Math Problems • Paper • 2504.09763 • Published Apr 14 • 13
Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization • Paper • 2504.08641 • Published Apr 11 • 7