OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published 2 days ago • 83
SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer Paper • 2501.18427 • Published 5 days ago • 14
Fast Encoder-Based 3D from Casual Videos via Point Track Processing Paper • 2404.07097 • Published Apr 10, 2024 • 2
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published 26 days ago • 87
Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers Paper • 2501.03931 • Published 28 days ago • 14
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control Paper • 2501.03847 • Published 28 days ago • 23
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos Paper • 2501.04001 • Published 28 days ago • 42
VMix: Improving Text-to-Image Diffusion Model with Cross-Attention Mixing Control Paper • 2412.20800 • Published Dec 30, 2024 • 10
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control Paper • 2501.01427 • Published Jan 2, 2025 • 49
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation Paper • 2412.21059 • Published Dec 30, 2024 • 18
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up Paper • 2412.16112 • Published Dec 20, 2024 • 22
VidTwin: Video VAE with Decoupled Structure and Dynamics Paper • 2412.17726 • Published Dec 23, 2024 • 8
VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models Paper • 2412.19645 • Published Dec 27, 2024 • 13
DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes Paper • 2412.11100 • Published Dec 15, 2024 • 7
ColorFlow: Retrieval-Augmented Image Sequence Colorization Paper • 2412.11815 • Published Dec 16, 2024 • 26