ewre324/ewre324-R1-SmolLM2-135M-Distill


ewre324 committed 13 days ago

Commit 3592b13 · verified · 1 parent: ef1d886

Update README.md

Files changed (1)

README.md +7 -1


@@ -6,4 +6,10 @@ base_model:
 ---
 Used Open R1 (by Huggingface)  to SFT my earlier thinker models. Encouraging results.
-Checkpoints also present.
+Checkpoints also present.
+https://github.com/ewre324/open-r1/tree/main
+Uses a DeepSeek R1-style method: training on a specific reasoning dataset to encourage more thinking.
+The <think> ... </think> tags are still not generated. TODO.