aashish1904 committed · Commit 7fe30f4 · verified · 1 Parent(s): e28df9c

Upload README.md with huggingface_hub

Files changed (1): README.md (+61 -0)
README.md ADDED
---
library_name: transformers
tags:
- trl
- sft
base_model:
- meta-llama/Llama-3.2-1B-Instruct
datasets:
- ngxson/MiniThinky-dataset
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/MiniThinky-v2-1B-Llama-3.2-GGUF
This is a quantized version of [ngxson/MiniThinky-v2-1B-Llama-3.2](https://huggingface.co/ngxson/MiniThinky-v2-1B-Llama-3.2) created using llama.cpp.
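
As a quick-start sketch (not part of the original card below), here is one way to load a GGUF file from this repo with the `llama-cpp-python` bindings. The quant filename glob and the sample prompt are assumptions; adjust them to the file you actually download. The system message is the one the original card says is required.

```python
# Hedged sketch: load a GGUF quant from this repo via llama-cpp-python.
# The filename glob (*Q4_K_M.gguf) is a hypothetical choice of quant level.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="QuantFactory/MiniThinky-v2-1B-Llama-3.2-GGUF",
    filename="*Q4_K_M.gguf",  # assumption: pick any quant present in the repo
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[
        # The model card below explains why this exact system message is required.
        {"role": "system", "content": (
            "You are MiniThinky, a helpful AI assistant. You always think "
            "before giving the answer. Use <|thinking|> before thinking and "
            "<|answer|> before giving the answer."
        )},
        {"role": "user", "content": "What is 7 * 8?"},
    ],
)
print(out["choices"][0]["message"]["content"])
```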

# Original Model Card

# MiniThinky 1B

This is a newer checkpoint of [MiniThinky-1B-Llama-3.2 (version 1)](https://huggingface.co/ngxson/MiniThinky-1B-Llama-3.2), whose loss decreased from 0.7 to 0.5.
27
+
28
+ Link to GGUF version: [click here](https://huggingface.co/ngxson/MiniThinky-v2-1B-Llama-3.2-Q8_0-GGUF)
29
+
30
+ Chat template is the same with llama 3, but the response will be as follow:
31
+
32
+ ```
33
+ <|thinking|>{thinking_process}
34
+ <|answer|>
35
+ {real_answer}
36
+ ```
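
Since those markers are plain text in the output, downstream code has to split them out itself. A minimal sketch (an assumption, not part of the original card) of separating the thinking trace from the final answer:

```python
# Hedged sketch: split a raw completion into its thinking trace and final
# answer, assuming the model emits the <|thinking|>/<|answer|> markers above.
def split_response(text: str) -> tuple[str, str]:
    thinking, _, answer = text.partition("<|answer|>")
    thinking = thinking.replace("<|thinking|>", "").strip()
    return thinking, answer.strip()

raw = "<|thinking|>2 + 2 is basic arithmetic.\n<|answer|>\n4"
thinking, answer = split_response(raw)
print(thinking)  # -> 2 + 2 is basic arithmetic.
print(answer)    # -> 4
```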

## IMPORTANT: System message

The model is **very sensitive** to the system message. Make sure you use this system message (system role) at the beginning of the conversation:

`You are MiniThinky, a helpful AI assistant. You always think before giving the answer. Use <|thinking|> before thinking and <|answer|> before giving the answer.`
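
For instance, a hedged sketch of sending that system message through `transformers` (the model id comes from the original card; the prompt and generation length are placeholders):

```python
# Hedged sketch: chat with the unquantized checkpoint via transformers,
# always putting the required system message first.
from transformers import pipeline

SYSTEM = (
    "You are MiniThinky, a helpful AI assistant. You always think before "
    "giving the answer. Use <|thinking|> before thinking and <|answer|> "
    "before giving the answer."
)

pipe = pipeline("text-generation", model="ngxson/MiniThinky-v2-1B-Llama-3.2")
messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": "Why is the sky blue?"},
]
result = pipe(messages, max_new_tokens=512)
print(result[0]["generated_text"][-1]["content"])
```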
43
+
44
+ ## Q&A
45
+
46
+ **Hardware used to trained it?**
47
+ I used a HF space with 4xL40S, trained for 5 hours. Eval loss is about 0.8
48
+
49
+ **Benchmark?**
50
+ I don't have time to do it alone. If you can help, please open a discussion!
51
+
52
+ **Can it count number of "r" in "raspberry"?**
53
+ Unfortunately no
54
+
55
+ **Other things that I can tune?**
56
+ Maybe lower temperature, or set top_k=1
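
A hedged illustration of those settings, reusing `pipe` and `messages` from the sketch in the system-message section (the exact values are assumptions; `top_k=1` makes sampling effectively greedy):

```python
# Assumed decoding tweak: low temperature plus top_k=1 (effectively greedy).
# `pipe` and `messages` are defined in the system-message sketch above.
result = pipe(messages, max_new_tokens=512, do_sample=True, temperature=0.3, top_k=1)
print(result[0]["generated_text"][-1]["content"])
```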

---

TODO: include more info here + maybe do some benchmarks? (Please open a discussion if you're interested.)