The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text Paper • 2506.05209 • Published 30 days ago • 42
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7 • 192
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 235
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations Paper • 2405.18392 • Published May 28, 2024 • 12