Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't • Paper • 2503.16219 • Published Mar 20
FineWeb: decanting the web for the finest text data at scale 🍷 • Generate high-quality web text data for LLM training