SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward • Paper • 2505.17018 • Published 2 days ago