A collection of multimodal reasoning models and benchmarks.
- Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models
  Paper • 2502.16033 • Published • 18
- rippleripple/MMIR
  Viewer • Updated • 534 • 65 • 2
- LLaVA-o1: Let Vision Language Models Reason Step-by-Step
  Paper • 2411.10440 • Published • 125
- GRIT: Teaching MLLMs to Think with Images
  Paper • 2505.15879 • Published • 8