LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning • Paper • 2505.16933 • Published 2 days ago • 23
UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning • Paper • 2505.14231 • Published 5 days ago • 48
Awaker2.5-VL: Stably Scaling MLLMs with Parameter-Efficient Mixture of Experts • Paper • 2411.10669 • Published Nov 16, 2024 • 10
MMRole: A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents • Paper • 2408.04203 • Published Aug 8, 2024