The Ultimate Collection of Code Classifiers Collection 🔥 15 classifiers, 124M parameters, one per programming language, for assessing the educational value of GitHub code • 15 items • Updated 7 days ago • 10
Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study Paper • 2502.02481 • Published 22 days ago • 9
GemmaX2 Collection GemmaX2 language models, including pretrained and instruction-tuned models in two sizes: 2B and 9B. • 7 items • Updated 20 days ago • 18
MMTEB Collection A collection of items related to the MMTEB release • 2 items • Updated 6 days ago • 1
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 6 days ago • 115
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? Paper • 2502.12115 • Published 9 days ago • 41
0x Lite Collection 0x Lite is the next model in the 0x Mini generation; it outputs extremely high-quality results. • 2 items • Updated Jan 25 • 4
Reverb Collection Ozone's most advanced series of AI language models yet. • 5 items • Updated 7 days ago • 1
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 12 items • Updated 7 days ago • 84
OpenR1-Math Collection Dataset and SFT model distilled from DeepSeek-R1. Check out our blog post for more details: https://huggingface.co/blog/open-r1/update-2 • 3 items • Updated 12 days ago • 6