|
--- |
|
base_model: |
|
- Pinkstack/Superthoughts-lite-v1 |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- llama |
|
- gguf |
|
- cot |
|
- superthoughts |
|
- reasoning |
|
- grpo |
|
license: apache-2.0 |
|
language: |
|
- en |
|
datasets: |
|
- openai/gsm8k |
|
- Pinkstack/intructions-sft-sharegpt |
|
--- |
|
|
|
 |
|
|
|
# Information |
|
High-quality, lightweight reasoning in a model small enough to run on your phone. |
|
|
|
Trained similarly to DeepSeek R1: we used SmolLM2 as the base model, SFT fine-tuned it on reasoning data with a slightly modified tokenizer, and then used GRPO to further improve its mathematics and problem-solving abilities. |
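To give a feel for the GRPO step, here is a minimal sketch of the group-relative advantage that GRPO is generally built around: several completions are sampled for the same prompt, each gets a reward, and each reward is normalized against its own group. The reward values and the function name `group_relative_advantages` are illustrative assumptions, not the training code used for this model.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; eps avoids division by zero
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical rewards for four sampled answers to one math question:
# 1.0 = correct final answer, 0.0 = incorrect.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that score above their group's mean get a positive advantage and are reinforced; those below the mean get a negative one.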
|
|
|
# Which quant is right for you? |
|
|
|
***F16***: Fewest hallucinations and high-quality reasoning, but heavy to run. |
|
***Q8_0***: Few hallucinations and high-quality reasoning. Recommended. |
|
***Q6_k***: Hallucinates more; reasoning is good but may fail at tasks such as counting. Use only if you cannot run Q8_0. |
|
***Q4_k_m***: Not recommended. Hallucinates and doesn't always think properly, though it is easier to run. |
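As a rough guide to what "heavy" and "easier to run" mean in practice, the sketch below estimates the weight size of each quant. It assumes about 1.7B parameters (taken from the base model name) and approximate bits-per-weight figures for these llama.cpp quant types; actual file sizes and memory use will differ, since metadata, mixed-precision tensors, and the KV cache are ignored.

```python
PARAMS = 1.7e9  # assumption: parameter count implied by SmolLM2-1.7B

# Approximate bits per weight for each quant type (assumed values).
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,
    "Q6_k": 6.5625,
    "Q4_k_m": 4.85,
}

def approx_size_gb(quant, params=PARAMS):
    """Estimate the size of the weights in gigabytes (10^9 bytes)."""
    return params * BITS_PER_WEIGHT[quant] / 8 / 1e9

sizes = {quant: round(approx_size_gb(quant), 2) for quant in BITS_PER_WEIGHT}
for quant, size in sizes.items():
    print(f"{quant}: ~{size} GB")
```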
|
|
|
# Format |
|
``` |
|
<|im_start|>user |
|
How many R's in strawberry<|im_end|> |
|
<|im_start|>assistant |
|
<think> |
|
Alright, the user has asked how many R's in the word strawberry, that's easy! I just need to count each instance of the letter 'R' in the word 's-t-r-a-w-b-e-r-r-y' and then find out how many R's there are, lets count! |
|
S - Not an R, |
|
T - Not an R, |
|
R - First instance of the letter R! (1), |
|
A - Not an R, |
|
W - Not an R, |
|
B - Not an R, |
|
E - Not an R, |
|
R - Great! Second instance of the letter R. (2), |
|
R - Third instance of the letter R. (3), |
|
Y - Not an R. |
|
|
|
So, i've counted all the letters correctly, meaning that I am sure that there are 3 R's in the word Strawberry. I should probably let the user know. |
|
</think> |
|
<output>3 |
|
</output><|im_end|> |
|
``` |
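Because the reasoning and the final answer arrive in separate tags, an application will usually want to split them. The following is a minimal sketch of one way to do that; `parse_response` is a hypothetical helper, and the sample string is a shortened stand-in for a real completion.

```python
import re

def parse_response(text):
    """Split a completion into its reasoning and its final answer."""
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    output = re.search(r"<output>(.*?)</output>", text, re.DOTALL)
    return {
        "think": think.group(1).strip() if think else None,
        "output": output.group(1).strip() if output else None,
    }

# Shortened, hypothetical completion in the format shown above.
sample = (
    "<think>\nS, T, R (1), A, W, B, E, R (2), R (3), Y.\n</think>\n"
    "<output>3\n</output><|im_end|>"
)
parsed = parse_response(sample)
```

Returning `None` for a missing tag lets the caller detect completions where the model skipped a section instead of failing silently.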
|
# System prompt |
|
(Important: it ensures the model always responds with both a think section and an output section.) |
|
``` |
|
respond in the following format: |
|
<think> |
|
... |
|
</think> |
|
<output> |
|
... |
|
</output> |
|
``` |
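If your runtime does not apply a chat template for you, the prompt can be assembled by hand. This sketch combines the system prompt with the chat format from the Format section. The `<|im_start|>system` turn is an assumption based on the usual ChatML layout (the Format section only shows user and assistant turns), and `build_prompt` is a hypothetical helper.

```python
SYSTEM_PROMPT = (
    "respond in the following format:\n"
    "<think>\n...\n</think>\n"
    "<output>\n...\n</output>"
)

def build_prompt(user_message, system_prompt=SYSTEM_PROMPT):
    """Assemble a single-turn prompt that ends where the assistant should start."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"  # assumed system turn
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = build_prompt("How many R's in strawberry")
```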
|
# Examples |
|
All responses below were generated with our system prompt at a temperature of 0.7, inside the Android application ChatterUI, using the GGUF Q8 quant and the model's prompt format. |
|
1) |
|
 |
|
2) |
|
 |
|
3) |
|
 |
|
|
|
# Uploaded model |
|
|
|
- **Developed by:** Pinkstack |
|
- **License:** apache-2.0 |
|
- **Finetuned from model:** HuggingFaceTB/SmolLM2-1.7B-Instruct |