view article Article FineWeb-C: A Community-Driven Dataset for Educational Quality Annotations in 122 Languages By davanstrien and 5 others • Jul 8 • 29
view article Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • Jul 8 • 629
view article Article FineWeb2-C: Help Build Better Language Models in Your Language By davanstrien and 5 others • Dec 23, 2024 • 21