OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference Paper • 2502.18411 • Published 1 day ago • 54
Phantom: Subject-consistent video generation via cross-modal alignment Paper • 2502.11079 • Published 11 days ago • 51
The Ultra-Scale Playbook 🌌: The ultimate guide to training LLM on large GPU Clusters • Running • 1.67k
On-device Sora: Enabling Diffusion-Based Text-to-Video Generation for Mobile Devices Paper • 2502.04363 • Published 22 days ago • 11
Magic 1-For-1: Generating One Minute Video Clips within One Minute Paper • 2502.07701 • Published 15 days ago • 32
ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features Paper • 2502.04320 • Published 20 days ago • 33
DynVFX: Augmenting Real Videos with Dynamic Content Paper • 2502.03621 • Published 21 days ago • 27
LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer Paper • 2502.01105 • Published 24 days ago • 19
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 22 days ago • 195
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published 24 days ago • 183
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback Paper • 2501.12895 • Published Jan 22 • 56
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 333
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published Jan 16 • 70
Diffusion Adversarial Post-Training for One-Step Video Generation Paper • 2501.08316 • Published Jan 14 • 33
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 273