Track4Gen: Teaching Video Diffusion Models to Track Points Improves Video Generation Paper • 2412.06016 • Published Dec 8, 2024 • 20
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 103
Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis Paper • 2412.15322 • Published Dec 19, 2024 • 18
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up Paper • 2412.16112 • Published Dec 20, 2024 • 23
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning Paper • 2504.17192 • Published Apr 24 • 110
Vid2World: Crafting Video Diffusion Models to Interactive World Models Paper • 2505.14357 • Published 6 days ago • 21