Pinkstack committed (verified) · Commit 9b2e667 · Parent(s): b0c58af

Update README.md

Files changed (1): README.md (+2 -1)
README.md CHANGED
@@ -35,7 +35,7 @@ Advanced, high-quality and **lite** reasoning for a tiny size that you can run o
 
 At original quality, it runs at ~400 tokens/second on a single H100 Nvidia GPU from Friendli.
 
-Trained similarly to Deepseek R1, we used Smollm2 as a base model, then we've SFT fine tuned on reasoning using our own private superthoughts instruct dataset, which includes a mix of code, website generation, day-to-day questions and answers, math. And then we modified the tokenizer slightly, after the SFT fine tuning we used GRPO to further amplify it's mathematics & problem solving abilities.
+Trained similarly to Deepseek R1, we used Smollm2 as a base model, then we've SFT fine tuned on reasoning using our own private superthoughts instruct dataset which includes a mix of code, website generation, day-to-day chats, math and counting problems. And then we modified the tokenizer slightly, after the SFT fine tuning we used GRPO to further amplify it's mathematics & problem solving abilities.
 
 # Format
 ```
@@ -60,6 +60,7 @@ So, i've counted all the letters correctly, meaning that I am sure that there ar
 <output>3
 </output><|im_end|>
 ```
+It is very reccomend to use a low temperature, higher temperatures may cause it to not think.
 # system prompt
 (important to ensure it would always think, output).
 ```
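The training recipe the diff describes (SFT on the private superthoughts instruct dataset, then GRPO to sharpen math and problem solving) maps onto Hugging Face's trl library roughly as sketched below. This is a minimal sketch, not the authors' actual code: the dataset files, the Smollm2 model size, the checkpoint paths, and the reward function are placeholder assumptions, since the superthoughts dataset is private and the exact setup is not shown in this commit.

```python
import re
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer, GRPOConfig, GRPOTrainer

# Stage 1: SFT on reasoning data. The data file is a placeholder;
# the actual "superthoughts instruct" dataset is private.
sft_data = load_dataset("json", data_files="superthoughts_instruct.jsonl", split="train")
sft = SFTTrainer(
    model="HuggingFaceTB/SmolLM2-360M-Instruct",  # one of the Smollm2 bases; the size used is not stated here
    train_dataset=sft_data,
    args=SFTConfig(output_dir="superthoughts-sft"),
)
sft.train()

# Stage 2: GRPO to amplify math / problem-solving ability.
# Toy reward: 1.0 if the text inside <output>...</output> matches the
# dataset's answer column. Assumes prompts and completions are plain
# strings (trl's "standard" dataset format).
def output_reward(completions, answer, **kwargs):
    rewards = []
    for completion, gold in zip(completions, answer):
        m = re.search(r"<output>(.*?)</output>", completion, re.DOTALL)
        rewards.append(1.0 if m and m.group(1).strip() == str(gold) else 0.0)
    return rewards

grpo_data = load_dataset("json", data_files="math_prompts.jsonl", split="train")
grpo = GRPOTrainer(
    model="superthoughts-sft",  # start from the SFT checkpoint (placeholder path)
    reward_funcs=output_reward,
    train_dataset=grpo_data,
    args=GRPOConfig(output_dir="superthoughts-grpo"),
)
grpo.train()
```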
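The format and temperature notes in the diff also suggest how to run the model: sample with a low temperature so the thinking step is not skipped, then parse the final answer out of the `<output>` tags. A minimal sketch with transformers follows; the model id and the system prompt text are illustrative assumptions, as the actual repository name and prompt are not shown in this hunk.

```python
import re
from transformers import pipeline

# Hypothetical model id; substitute the actual repository name.
generator = pipeline("text-generation", model="Pinkstack/Superthoughts-lite")

messages = [
    # Placeholder system prompt; the README's real prompt is what ensures
    # the model always thinks, then answers inside <output> tags.
    {"role": "system", "content": "Think step by step, then put the final answer in <output></output> tags."},
    {"role": "user", "content": "How many times does the letter 'r' appear in 'strawberry'?"},
]

# Low temperature, per the README note that higher temperatures may
# cause the model to not think.
result = generator(messages, max_new_tokens=512, do_sample=True, temperature=0.2)
reply = result[0]["generated_text"][-1]["content"]

# Extract the final answer from the <output>...</output> block.
match = re.search(r"<output>(.*?)</output>", reply, re.DOTALL)
print(match.group(1).strip() if match else reply)
```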