WorldSimBench: Towards Video Generation Models as World Simulators • Paper • 2410.18072 • Published 18 days ago • 16
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion • Paper • 2406.03184 • Published Jun 5 • 18
CityDreamer: Compositional Generative Model of Unbounded 3D Cities • Paper • 2309.00610 • Published Sep 1, 2023 • 18
ChEF: A Comprehensive Evaluation Framework for Standardized Assessment of Multimodal Large Language Models • Paper • 2311.02692 • Published Nov 5, 2023 • 1
LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark • Paper • 2306.06687 • Published Jun 11, 2023 • 1
MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception • Paper • 2312.07472 • Published Dec 12, 2023 • 2
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities • Paper • 2401.15071 • Published Jan 26 • 34