SmolLm2-135 R1 Distill
#5, opened by ewre324
Hello, I just used SFT to produce an R1 distill.
https://huggingface.co/ewre324/ewre324-R1-SmolLM2-135M-Distill
Please try it out and leave a comment if possible.
I think the downside of thinking models is that even for simple questions they may spend a lot of thinking tokens. I think we should have a dataset that trains LLMs to figure out when to use a thinking strategy and when to simply answer the question like regular LLMs do.
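To make the dataset idea a bit more concrete, here is a rough sketch of how such mixed SFT records might be laid out. Everything in it is an assumption for illustration: the `Example` class, the `format_target` and `build_sft_records` helpers, and the prompt/completion field names are hypothetical, and the `<think>` tags just mirror the R1-style output format. It is not how the model above was trained.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Example:
    """One training example (hypothetical schema)."""
    question: str
    answer: str
    reasoning: Optional[str] = None  # None means "answer directly, no thinking"


def format_target(ex: Example) -> str:
    """Build the completion text: add a <think> block only when reasoning exists."""
    if ex.reasoning:
        return f"<think>\n{ex.reasoning}\n</think>\n{ex.answer}"
    return ex.answer


def build_sft_records(examples):
    """Turn examples into prompt/completion pairs for supervised fine-tuning."""
    return [
        {"prompt": ex.question, "completion": format_target(ex)}
        for ex in examples
    ]


# Mix of an easy question (direct answer) and a harder one (with reasoning).
examples = [
    Example("What is the capital of France?", "Paris"),
    Example(
        "What is 17 * 24?",
        "408",
        reasoning="17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408",
    ),
]
records = build_sft_records(examples)
```

The hard part, which this sketch skips, is deciding which questions deserve a reasoning trace in the first place. Here that label is simply supplied by hand through the `reasoning` field.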