Update README.md
README.md
CHANGED
@@ -22,4 +22,16 @@ except for f16 and q8_0, every quant is using the `merge.imatrix`
 
 full wiki.train would have taken 10h
 
-for more info on imatrix handling see https://github.com/ggerganov/llama.cpp/pull/5302
+for more info on imatrix handling see https://github.com/ggerganov/llama.cpp/pull/5302
+
+### ppl (512 wiki.test, 300chunks)
+| quant | ppl (lower is better) |
+|----------------|-----|
+| f16(baseline) | xxx |
+| q8_0 | xxx |
+| q5_k_m | xxx |
+| q4_k_m | xxx |
+| iq3_xxs(merge) | 6.1984 +/- 0.05475 |
+| q2_k | xxx |
+| iq2_xs | xxx |
+| iq2_xxs | xxx |
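
For context on the "ppl (lower is better)" column: perplexity is commonly defined as the exponential of the average negative log-likelihood per token. Below is a minimal sketch of that definition, assuming natural-log token probabilities. The function name `perplexity` and the toy inputs are hypothetical, and this is not the llama.cpp implementation used to produce the numbers in the table.

```python
import math


def perplexity(token_logprobs):
    """Return exp(mean negative log-likelihood) over the given tokens.

    token_logprobs: natural-log probabilities the model assigned to each
    observed token (hypothetical input, for illustration only).
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Toy check: a model that assigns probability 0.25 to every token
# is as uncertain as a uniform choice among 4 options.
toy_logprobs = [math.log(0.25)] * 8
print(perplexity(toy_logprobs))
```

A lower value means the model assigned higher probability to the test text, which is why the quantized variants are compared against the f16 baseline.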