Step 1: Reproducing DeepSeek's Distilled Models

open-r1 's Collections

updated 23 days ago

Code for training and evaluation: https://github.com/huggingface/open-r1

open-r1/Mixture-of-Thoughts

Viewer • Updated 23 days ago • 699k • 38.2k • 228

Note A curated reasoning dataset that reproduces the performance of DeepSeek's distilled models
open-r1/OpenR1-Distill-7B

Text Generation • Updated 23 days ago • 2.94k • 14

Note The SFT model trained on Mixture-of-Thoughts that replicates the performance of deepseek-ai/DeepSeek-R1-Distill-Qwen-7B, starting from the same base model
open-r1/Qwen2.5-Math-7B-RoPE-300k

Text Generation • Updated 27 days ago • 7.47k • 2

Note The base model used to optimise the data mixture of Mixture-of-Thoughts