Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning Paper • 2506.04207 • Published 20 days ago • 45 • 4