Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation Paper • 2508.07981 • Published 14 days ago • 58
Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation Paper • 2508.07901 • Published 14 days ago • 38
ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing Paper • 2508.10881 • Published 11 days ago • 50
Phantom-Data : Towards a General Subject-Consistent Video Generation Dataset Paper • 2506.18851 • Published Jun 23 • 29
OmniGen2: Exploration to Advanced Multimodal Generation Paper • 2506.18871 • Published Jun 23 • 74
Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition Paper • 2506.17201 • Published Jun 20 • 55
SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning Paper • 2506.15154 • Published Jun 18 • 8
Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model Paper • 2506.13642 • Published Jun 16 • 27
PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework Paper • 2506.10741 • Published Jun 12 • 27
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation Paper • 2506.09350 • Published Jun 11 • 48
HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation Paper • 2504.21650 • Published Apr 30 • 16
FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios Paper • 2505.03730 • Published May 6 • 28
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation Paper • 2504.14899 • Published Apr 21 • 21
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance Paper • 2504.06232 • Published Apr 8 • 14
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11 • 130
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation Paper • 2504.02160 • Published Apr 2 • 38
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published Apr 8 • 181
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis Paper • 2504.04842 • Published Apr 7 • 36