Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published about 17 hours ago • 15
Running 2.23k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think Paper • 2503.00948 • Published 11 days ago • 3
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment Paper • 2502.16894 • Published 17 days ago • 27
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model Paper • 2411.19108 • Published Nov 28, 2024 • 19
Running on Zero 1.88k Chat With Janus-Pro-7B 🌍 A unified multimodal understanding and generation model.
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback Paper • 2501.12895 • Published Jan 22 • 57