MMRole: A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents Paper • 2408.04203 • Published Aug 8, 2024 • 1
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning Paper • 2505.16933 • Published May 22 • 32
UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning Paper • 2505.14231 • Published May 20 • 51
Awaker2.5-VL: Stably Scaling MLLMs with Parameter-Efficient Mixture of Experts Paper • 2411.10669 • Published Nov 16, 2024 • 10