DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 122
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28 • 122
Running 2.66k 2.66k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters