Article Introducing Observers: AI Observability with Hugging Face datasets through a lightweight SDK By davidberenstein1957 • Nov 21 • 34
Article Releasing the largest multilingual open pretraining dataset By Pclanglais • Nov 13 • 98
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated 29 days ago • 99
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated 28 days ago • 351
📀 Dataset comparison models Collection 1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12 • 34
Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck Paper • 2404.07647 • Published Apr 11 • 4
OpenCulture Collection A multilingual dataset of public domain books and newspapers. • 27 items • Updated Nov 6 • 121
TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese Paper • 2401.16640 • Published Jan 30 • 8
Model Merging Collection Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12 • 218
Apple MLX-compatible 7B LLMs on the 🤗 Hub Collection This collection contains the model weights for 7B LLMs for Apple's MLX framework. Find more information at https://github.com/ml-explore/mlx • 8 items • Updated Sep 2 • 9