LLaVA-Video Collection Models focus on video understanding (previously known as LLaVA-NeXT-Video). • 8 items • Updated 17 days ago • 60
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency Paper • 2502.09621 • Published 24 days ago • 27