FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26 • 64
Post: Meet our new agentic model: Devstral. Devstral is an open-source LLM built for software engineering tasks, developed in a collaboration between Mistral AI and All Hands AI.
Key features:
• Agentic: perfect for agentic coding
• Lightweight: Devstral is a 24B-parameter model based on Mistral Small
• Apache 2.0, meaning fully open-source
• A 128k context window
Blog: https://mistral.ai/news/devstral
API: The model is also available on our API under the name devstral-small-2505
Repo: mistralai/Devstral-Small-2505
Can't wait to see what you will build with it!
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 241
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published Jan 14 • 64
Post: Everchanging Quest is out! It is an LLM-controlled roguelike in which the LLM gets a markdown representation of the map and should generate a JSON with the objective to fulfill on the map, as well as the necessary objects and their placements. Come test it on the Space: Jofthomas/Everchanging-Quest
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 98
A Dataset and Strong Baselines for Classification of Czech News Texts Paper • 2307.10666 • Published Jul 20, 2023