How Programming Concepts and Neurons Are Shared in Code Language Models Paper • 2506.01074 • Published 4 days ago • 3
Tracing Multilingual Factual Knowledge Acquisition in Pretraining Paper • 2505.14824 • Published 16 days ago • 4
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated Apr 28 • 616
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 232
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31 • 285
— UI is a good thing 💅 — Collection cool spaces with a cool UI, what could be better? • 5 items • Updated May 5 • 19
MMTEB Collection Our contribution to the Massive Multilingual Text Embedding Benchmark (MMTEB). Retrieval and reranking benchmarks in 16 languages. • 4 items • Updated Jun 6, 2024 • 3
CommonCrawl Collection Large web-mined general corpus based on CommonCrawl. • 8 items • Updated Apr 13 • 3
view article Article Finding Moroccan Arabic (Darija) in Fineweb 2 By omarkamali and 3 others • Dec 8, 2024 • 23