SRPO: Self-Referential Policy Optimization for Vision-Language-Action Models • arXiv:2511.15605 • Published Nov 19, 2025
MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision • arXiv:2505.13427 • Published May 19, 2025
CoMP: Continual Multimodal Pre-training for Vision Foundation Models • arXiv:2503.18931 • Published Mar 24, 2025