Featured: Distilling 100B+ Models 40x Faster with TRL (TRL distillation for 100B+ teachers, 40x faster)
arcee-ai/Trinity-Large-Thinking • Text Generation • 399B • Updated 9 days ago • 18.6k • 157
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 • Text Generation • 124B • Updated 7 days ago • 551k • 332
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens (Explore synthetic data experiments on a virtual bookshelf)