Article Building the Hugging Face MCP Server By evalstate and 3 others • 10 days ago • 41
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • 12 days ago • 556
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory • 15 items • Updated 10 days ago • 205
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated Apr 28 • 506
EXAONE-Deep Collection EXAONE reasoning model series of 2.4B, 7.8B, and 32B, optimized for reasoning tasks including math and coding • 10 items • Updated 13 days ago • 92
steiner-preview Collection Reasoning models trained on synthetic data using reinforcement learning. • 3 items • Updated Oct 20, 2024 • 32
Article From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub By jsulz and 3 others • Feb 12 • 69
Article SmolVLM Grows Smaller – Introducing the 250M & 500M Models! By andito and 2 others • Jan 23 • 182
Article Releasing the largest multilingual open pretraining dataset By Pclanglais and 2 others • Nov 13, 2024 • 102
Gemma 2 JPN Release Collection A Gemma 2 2B model fine-tuned on Japanese text. It supports the Japanese language with the same level of performance as English-only queries on Gemma 2. • 3 items • Updated 10 days ago • 29
Gemma 2 2B Release Collection The 2.6B parameter version of Gemma 2. • 6 items • Updated 10 days ago • 81
WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation Paper • 2312.14187 • Published Dec 20, 2023 • 52