Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations Paper • 2506.18898 • Published 2 days ago • 23
MME-Reasoning: A Comprehensive Benchmark for Logical Reasoning in MLLMs Paper • 2505.21327 • Published 30 days ago • 83
MME-Reasoning: A Comprehensive Benchmark for Logical Reasoning in MLLMs Paper • 2505.21327 • Published 30 days ago • 83
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward Paper • 2505.17018 • Published May 22 • 15
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification Paper • 2505.16938 • Published May 22 • 117
SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing Paper • 2503.04629 • Published Mar 6 • 18
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training Paper • 2412.11863 • Published Dec 16, 2024 • 4
Chimera: Improving Generalist Model with Domain-Specific Experts Paper • 2412.05983 • Published Dec 8, 2024 • 9
Chimera: Improving Generalist Model with Domain-Specific Experts Paper • 2412.05983 • Published Dec 8, 2024 • 9