Fine-tune ModernBERT for RAG with Synthetic Data By sdiazlor and 2 others • Jan 20 • 37
FineWeb2-C: Help Build Better Language Models in Your Language By davanstrien and 5 others • Dec 23, 2024 • 19
Introducing the Synthetic Data Generator - Build Datasets with Natural Language By davidberenstein1957 and 5 others • Dec 16, 2024 • 121
Open Preference Dataset for Text-to-Image Generation by the 🤗 Community By davidberenstein1957 and 6 others • Dec 9, 2024 • 56
Let’s make a generation of amazing image generation models By burtenshaw and 4 others • Nov 26, 2024 • 33
Argilla 2.4: Easily Build Fine-Tuning and Evaluation datasets on the Hub — No Code Required By nataliaElv and 2 others • Nov 4, 2024 • 42
How to build a custom text classifier without days of human labeling By sdiazlor and 4 others • Oct 17, 2024 • 55
How to optimize your data labelling project with custom interfaces By burtenshaw and 9 others • Oct 16, 2024 • 18
🔥 Argilla 2.0: the data-centric tool for AI makers 🤗 By dvilasuero • Jul 30, 2024 • 37
Llama 3.1 - 405B, 70B & 8B with multilinguality and long context By philschmid and 7 others • Jul 23, 2024 • 232
How we leveraged distilabel to create an Argilla 2.0 Chatbot By plaguss and 4 others • Jul 16, 2024 • 33
Ethics and Society Newsletter #6: Building Better AI: The Importance of Data Quality By evijit and 9 others • Jun 24, 2024 • 34
🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • Jun 4, 2024 • 78