V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models Paper • 2504.06148 • Published 18 days ago • 13
When Preferences Diverge: Aligning Diffusion Models with Minority-Aware Adaptive DPO Paper • 2503.16921 • Published Mar 21 • 6
LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer Paper • 2412.13871 • Published Dec 18, 2024 • 18