Celebrating One Year of #SauerkrautLM with Two Groundbreaking Releases!
We're thrilled to announce the release of SauerkrautLM-v2-14b in two specialized versions: VAGOsolutions/SauerkrautLM-v2-14b-SFT and VAGOsolutions/SauerkrautLM-v2-14b-DPO. Built on the robust Qwen2.5-14B foundation, these models represent a significant leap forward in multilingual AI capabilities.
Technical Breakthroughs:
- Innovative three-phase fine-tuning approach
- Two-step Spectrum SFT plus a one-step Spectrum DPO optimization phase for enhanced performance
- Balance of German and English language capabilities
- Advanced function calling, almost on par with Claude-3.5-Sonnet-20240620
Training Innovation: Our three-phase approach targeted specific layer percentages (15%, 20% and 25%) with carefully curated datasets, including:
- Mathematics-focused content (proprietary classifier-selected)
- High-quality German training data
- Specialized function calling datasets
- Premium multilingual content
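To make the layer percentages concrete, here is a minimal Python sketch of picking the top share of layers to train while the rest stay frozen. It assumes every layer already has a ranking score (the Spectrum method ranks layers by a signal-to-noise measure). The layer names, the scores, and the `select_top_percent` helper are hypothetical and are not taken from the actual SauerkrautLM training setup.

```python
def select_top_percent(scores, percent):
    """Return the names of the highest-scoring layers, keeping `percent` % of them."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    # Integer arithmetic avoids floating-point rounding; keep at least one layer.
    k = max(1, len(scores) * percent // 100)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]

# Hypothetical scores for 20 layers (illustrative values, not real measurements).
layer_scores = {f"layer.{i}": (i * 7) % 20 for i in range(20)}

# One selection per training phase, using the percentages from the post.
for percent in (15, 20, 25):
    trainable = select_top_percent(layer_scores, percent)
    print(f"{percent}% -> {len(trainable)} trainable layers: {trainable}")
```

In a real run, only the selected layers would have gradients enabled; everything else would stay frozen for that phase.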
Community Contribution: We're also releasing two new datasets in a few days:
1. SauerkrautLM-Fermented-GER-DPO: 3,300 high-quality German training samples
2. SauerkrautLM-Fermented-Irrelevance-GER-DPO: 2,000 specialized samples for optimized function call irrelevance handling
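As a rough illustration of what a function-call irrelevance preference sample might look like, the sketch below builds one record in the prompt/chosen/rejected layout commonly used for DPO training. The field names, the example tool, and the texts are assumptions for illustration only; the released datasets may use a different schema.

```python
import json

def make_preference_sample(prompt, chosen, rejected):
    """Build one DPO-style preference record and check that it is usable."""
    if chosen == rejected:
        raise ValueError("chosen and rejected responses must differ")
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

# Hypothetical tool offered to the model.
tools = [{"name": "get_weather", "parameters": {"city": "string"}}]

# The user question has nothing to do with the available tool.
prompt = json.dumps(
    {"tools": tools, "user": "Wer schrieb 'Faust'?"}, ensure_ascii=False
)

sample = make_preference_sample(
    prompt=prompt,
    # Preferred: answer directly, without calling any tool.
    chosen="'Faust' wurde von Johann Wolfgang von Goethe geschrieben.",
    # Dispreferred: an irrelevant tool call.
    rejected=json.dumps(
        {"tool_call": {"name": "get_weather", "arguments": {"city": "Faust"}}}
    ),
)
print(json.dumps(sample, ensure_ascii=False, indent=2))
```

The point of such pairs is that the model learns to prefer a plain answer when none of the offered tools fits the request.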
Thank you to our incredible community and partners who have supported us throughout this journey. Here's to another year of AI innovation!
Progress in the German FineWeb-Edu reproduction
We're delighted to share the launch of our new Data Quality Classification Model, designed specifically for evaluating educational content in German. This tool uses advanced machine learning techniques to assess texts across all educational levels, from primary school to university.
Inspired by Hugging Face's FineWeb-Edu dataset, we've worked hard to refine our data classification methods so that educators and learners can access top-quality resources. We're excited about the future as we continue improving our models and expanding our datasets.
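A quality classifier like this is typically used to filter a corpus: each document gets a score, and only documents at or above a cutoff are kept. The sketch below assumes a 0 to 5 educational score and a cutoff of 3, mirroring the setup described for the original FineWeb-Edu. The `score_fn` stand-in and the sample texts are hypothetical, and the German classifier may use a different scale or threshold.

```python
def filter_by_quality(docs, score_fn, threshold=3.0):
    """Keep documents whose educational score is at least `threshold`."""
    return [doc for doc in docs if score_fn(doc) >= threshold]

# Hypothetical precomputed scores standing in for real classifier output.
precomputed = {
    "Die Photosynthese wandelt Lichtenergie in chemische Energie um.": 4.2,
    "Jetzt kaufen! Nur heute 50% Rabatt!": 0.4,
    "Der Satz des Pythagoras: a^2 + b^2 = c^2.": 3.6,
}

def score_fn(text):
    # Stand-in for a model call; a real pipeline would run the classifier here.
    return precomputed[text]

kept = filter_by_quality(list(precomputed), score_fn)
for text in kept:
    print(text)
```

Raising the threshold trades corpus size for quality, so the cutoff is usually tuned against downstream results.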
A huge thank you to David and Daryoush from Vago Solutions and to Björn and Jan from Ellamind / DiscoResearch for their expert insights throughout this project. Your support has been crucial. This project was made possible by the support of PrimeLine AI.