StreamDiT: Real-Time Streaming Text-to-Video Generation Paper • 2507.03745 • Published 5 days ago • 20
ERNIE 4.5 Collection A collection of ERNIE 4.5 models. "-Paddle" models use PaddlePaddle weights, while "-PT" models use Transformer-style PyTorch weights. • 23 items • Updated 6 days ago • 144
Kontext Dev LoRAs Collection Collection of Kontext Dev LoRAs by fal • 19 items • Updated 2 days ago • 9
ARIG: Autoregressive Interactive Head Generation for Real-time Conversations Paper • 2507.00472 • Published 8 days ago • 10
Audio-Sync Video Generation with Multi-Stream Temporal Control Paper • 2506.08003 • Published 30 days ago • 3
Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis Paper • 2412.15322 • Published Dec 19, 2024 • 19
Holo1 Collection Vision-Language Action Model for use in the Surfer-H web navigation agent • 6 items • Updated 28 days ago • 48
Style Customization of Text-to-Vector Generation with Image Diffusion Priors Paper • 2505.10558 • Published May 15 • 15
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation Paper • 2505.04512 • Published May 7 • 36
FaceID-6M: A Large-Scale, Open-Source FaceID Customization Dataset Paper • 2503.07091 • Published Mar 10 • 3
h-Edit: Effective and Flexible Diffusion-Based Editing via Doob's h-Transform Paper • 2503.02187 • Published Mar 4 • 5