SYNTHETIC-1
Collection
A collection of tasks & verifiers for reasoning datasets
•
9 items
•
Updated
•
44
SYNTHETIC-1-7B-SFT is an initial model trained on the SFT subset of SYNTHETIC-1, a collaboratively generated reasoning dataset from Deepseek-R1. The model largely outperforms other models based on Qwen-2.5-Instruct-7B that were trained with smaller reasoning datasets.
All SYNTHETIC-1 datasets can be found in our 🤗 SYNTHETIC-1 Collection.
Feel free to cite SYNTHETIC-1 if you have found it useful for your work
@misc{2025synthetic1,
title={SYNTHETIC-1: Two Million Collaboratively Generated Reasoning Traces from Deepseek-R1},
author={Justus Mattern and Sami Jaghouar and Manveer Basra and Jannik Straube and Matthew Di Ferrante and Felix Gabriel and Jack Min Ong and Vincent Weisser and Johannes Hagemann},
year={2025},
url={https://www.primeintellect.ai/blog/synthetic-1-release},
}