Cosmos Tokenizer Collection A suite of image and video tokenizers • 10 items • Updated 3 days ago • 11
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 8 items • Updated 5 days ago • 160
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think Paper • 2409.11355 • Published Sep 17 • 28
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders Paper • 2410.22366 • Published 12 days ago • 71
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think Paper • 2410.06940 • Published Oct 9 • 4
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 8 items • Updated 3 days ago • 89
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation Paper • 2410.13861 • Published 23 days ago • 53
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines Paper • 2410.12705 • Published 24 days ago • 29
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding Paper • 2404.16710 • Published Apr 25 • 73
LayerSkip Collection Models continually pretrained using LayerSkip - https://arxiv.org/abs/2404.16710 • 8 items • Updated 14 days ago • 42
Autoregressive Speech Synthesis without Vector Quantization Paper • 2407.08551 • Published Jul 11 • 14