HunyuanVideo-Avatar: High-Fidelity Audio-Driven Human Animation for Multiple Characters Paper • 2505.20156 • Published 1 day ago • 1
HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation Paper • 2503.18860 • Published Mar 24 • 5
LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models Paper • 2505.19223 • Published 2 days ago • 6
Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression Paper • 2505.19602 • Published 1 day ago • 9
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation Paper • 2504.12626 • Published Apr 17 • 50
Direct3D-S2: Gigascale 3D Generation Made Easy with Spatial Sparse Attention Paper • 2505.17412 • Published 5 days ago • 14
MedGemma Release Collection • Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 4 items • Updated 5 days ago • 129
Model Already Knows the Best Noise: Bayesian Active Noise Selection via Attention in Video Diffusion Model Paper • 2505.17561 • Published 4 days ago • 27
Training-Free Efficient Video Generation via Dynamic Token Carving Paper • 2505.16864 • Published 5 days ago • 21
SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation Paper • 2503.09641 • Published Mar 12 • 39
nanoVLM: The simplest repository to train your VLM in pure PyTorch Article • By ariG23498 and 6 others • 7 days ago • 97