Seamless Communication Collection A significant step towards removing language barriers through expressive, fast and high-quality AI translation. • 16 items • Updated Jan 16 • 144
Jamba-1.5 Collection The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction following foundation models • 2 items • Updated 28 days ago • 71
Gemma Scope Release Collection A comprehensive, open suite of sparse autoencoders for Gemma 2 2B and 9B. • 10 items • Updated Aug 11 • 13
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 169
Llama 3.1 Collection This collection hosts the transformers and original repos of the Meta Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Aug 2 • 569
SSMs Collection A collection of Mamba-2-based research models with 8B parameters trained on 3.5T tokens for comparison with Transformers. • 5 items • Updated Jul 17 • 23
LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated Jun 27 • 147
view article Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24 • 166
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated Jul 17 • 156
Stable Diffusion 3 Collection Stable Diffusion 3 and related models for text-to-image and image-to-image • 2 items • Updated Jun 12 • 86
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14 • 123
C4AI Aya 23 Collection Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 4 items • Updated Aug 6 • 45
view article Article Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens tokens and 11 languages May 24 • 24
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 27 items • Updated 1 day ago • 460
Granite Speculators Collection A collection of accelerators for the Granite LLM/Code family of models • 5 items • Updated Aug 13 • 5
Granite Time Series Models Collection A collection of time series models trained by IBM licensed under Apache 2.0 license. • 4 items • Updated Jul 19 • 10
Granite Code Models Collection A series of code models trained by IBM licensed under Apache 2.0 license. We release both the base pretrained and instruct models. • 23 items • Updated 20 days ago • 155
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated Jul 31 • 133
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation Paper • 2403.04692 • Published Mar 7 • 40
PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models Paper • 2401.05252 • Published Jan 10 • 45
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis Paper • 2310.00426 • Published Sep 30, 2023 • 61
PixArt-Sigma Collection This collection contains all the PixArt-Sigma related models, spaces and so on. • 9 items • Updated May 4 • 5
PixArt-Alpha Collection This collection organize all the PixArt-Alpha related models, datasets and so on. • 9 items • Updated May 4 • 2
PixArt-Delta Collection This collection contains all the PixArt-Delta related models, spaces and so on. • 4 items • Updated May 4 • 3
Arctic Collection A collection of pre-trained dense-MoE Hybrid transformer models • 2 items • Updated Apr 24 • 22
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 250
Breeze-7B Collection Breeze-7B is a language model family that builds on top of Mistral-7B, specifically intended for Traditional Chinese use. • 8 items • Updated Apr 25 • 9
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Aug 2 • 673
C4AI Command R Collection C4AI Command-R is a research release of a 35 billion parameter highly performant generative model. Command-R is a large language model with open weigh • 4 items • Updated 20 days ago • 15
Recent models: last 100 repos, sorted by creation date Collection The last 100 repos I have created. Sorted by creation date descending, so the most recently created repos appear at the top. • 121 items • Updated Jan 31 • 489
Latent Consistency Models LoRAs Collection Latent Consistency Models for Stable Diffusion - LoRAs and full fine-tuned weights • 4 items • Updated Nov 10, 2023 • 98