SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning Paper • 2506.01713 • Published 5 days ago • 31
A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment Paper • 2504.15585 • Published Apr 22 • 13
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published Dec 11, 2024 • 54