MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing Paper • 2502.21291 • Published 9 days ago • 4
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think Paper • 2502.20172 • Published 11 days ago • 26
FantasyID: Face Knowledge Enhanced ID-Preserving Video Generation Paper • 2502.13995 • Published 19 days ago • 8
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published 23 days ago • 51
VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer Paper • 2502.05979 • Published 28 days ago • 8
EchoVideo: Identity-Preserving Human Video Generation by Multimodal Feature Fusion Paper • 2501.13452 • Published Jan 23 • 7
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step Paper • 2501.13926 • Published Jan 23 • 37