VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping Paper • 2412.11279 • Published 10 days ago • 12
Causal Diffusion Transformers for Generative Modeling Paper • 2412.12095 • Published 9 days ago • 23 • 3
Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel Paper • 2412.08467 • Published 15 days ago • 5
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM Paper • 2412.09618 • Published 13 days ago • 21
TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration Paper • 2410.12183 • Published Oct 16 • 3
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper • 2406.06525 • Published Jun 10 • 65
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction Paper • 2404.02905 • Published Apr 3 • 65
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding Paper • 2403.15377 • Published Mar 22 • 22