The Ultra-Scale Playbook 🌌 The ultimate guide to training LLMs on large GPU clusters
Towards Best Practices for Open Datasets for LLM Training • Paper 2501.08365 • Published Jan 14
Preference Leakage: A Contamination Problem in LLM-as-a-judge • Paper 2502.01534 • Published Feb 3
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model • Paper 2502.02737 • Published Feb 4