InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption Paper • 2412.09283 • Published 14 days ago • 19