Frame In-N-Out: Unbounded Controllable Image-to-Video Generation • Paper • 2505.21491 • Published 30 days ago • 17
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent • Paper • 2309.12311 • Published Sep 21, 2023 • 17
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination • Paper • 2406.05132 • Published Jun 7, 2024 • 31