NewstaR/CoTton-6k
Light as cotton, deep as thought.
Note: The first CoTton dataset; it used only 3 teacher models.
Note: Expands CoTton-6k to 38k samples, weighted toward DeepSeek R1 and R1 0528 samples.
Note: Intended as an extension of CoTton-38k, with samples drawn from the a-m-team's R1 0528 dataset. It differs from CoTton in that the "metadata" was removed, leaving only human-assistant pairs, provided both with tags kept and with tags removed. Unlike the sampling for CoTton-38k, which downloaded the full dataset, this method uses snapshot download to fetch only specific files: others.jsonl, if.jsonl, and science.jsonl.
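A selective download of this kind could look roughly like the sketch below. It assumes "snapshot download" refers to `snapshot_download` from the `huggingface_hub` library, which accepts an `allow_patterns` filter. The helper names `build_allow_patterns` and `fetch_selected_files` are hypothetical, and the source repository id is not stated in the note, so it is left as a parameter.

```python
from pathlib import Path

# File names taken from the note above.
WANTED_FILES = ["others.jsonl", "if.jsonl", "science.jsonl"]


def build_allow_patterns(filenames):
    """Return glob patterns matching each file at the repo root or in a subfolder."""
    patterns = []
    for name in filenames:
        patterns.append(name)         # file at the repo root
        patterns.append(f"*/{name}")  # file inside a subdirectory
    return patterns


def fetch_selected_files(repo_id, filenames=WANTED_FILES, local_dir=None):
    """Download only the listed files from a dataset repo (hypothetical helper).

    Assumption: repo_id is the id of the source dataset on the Hugging Face Hub;
    the note does not give it, so the caller must supply it.
    """
    # Imported lazily so build_allow_patterns works without the dependency.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=repo_id,
        repo_type="dataset",
        allow_patterns=build_allow_patterns(filenames),
        local_dir=local_dir,
    )
    return Path(path)


# Example call (needs network access and a real repo id, so it is not run here):
# data_dir = fetch_selected_files("<org>/<dataset>")
```

Filtering at download time avoids pulling the remaining files of the source dataset, which matches the contrast the note draws with the full download used for CoTton-38k.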