GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1