I-Con: A Unifying Framework for Representation Learning • Paper • 2504.16929 • Published about 21 hours ago • 17
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning • Paper • 2504.16080 • Published 2 days ago • 9
Vidi: Large Multimodal Models for Video Understanding and Editing • Paper • 2504.15681 • Published 2 days ago • 13
NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors • Paper • 2504.11427 • Published 9 days ago • 17
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft • Paper • 2504.08388 • Published 13 days ago • 39
VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks • Paper • 2504.05118 • Published 17 days ago • 25
NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representations • Paper • 2503.23162 • Published 26 days ago • 11
FreSca: Unveiling the Scaling Space in Diffusion Models • Paper • 2504.02154 • Published 22 days ago • 18
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance • Paper • 2504.01724 • Published 22 days ago • 64
DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness • Paper • 2503.22677 • Published 27 days ago • 6
Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data • Paper • 2503.21694 • Published 28 days ago • 16
ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model • Paper • 2503.21144 • Published 28 days ago • 25
Enabling Versatile Controls for Video Diffusion Models • Paper • 2503.16983 • Published Mar 21 • 15