ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs • Paper • 2506.10128 • Published 22 days ago
MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning • Paper • 2506.05523 • Published 28 days ago
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement • Paper • 2504.07934 • Published Apr 10
Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension • Paper • 2412.03704 • Published Dec 4, 2024