open-r1/Mixture-of-Thoughts
Code for training and evaluation: https://github.com/huggingface/open-r1
Note: A curated reasoning dataset that reproduces the performance of DeepSeek's distilled models.
Note: The SFT model trained on Mixture-of-Thoughts that replicates the performance of deepseek-ai/DeepSeek-R1-Distill-Qwen-7B, starting from the same base model.
Note: The base model used to optimise the data mixture of Mixture-of-Thoughts.