ubergarm committed on
Commit 7972ce1 · 1 Parent(s): a4cc885

uploading more and publishing Perplexities

Files changed (1)
  1. README.md +7 -9
README.md CHANGED
@@ -12,8 +12,6 @@ tags:
  - ik_llama.cpp
  ---
 
- *WIP* Still cooking and will upload ASAP.
-
  ## `ik_llama.cpp` imatrix Quantizations of Qwen/Qwen3-Coder-480B-A35B-Instruct
  This quant collection **REQUIRES** [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/) fork to support the ik's latest SOTA quants and optimizations! Do **not** download these big files and expect them to run on mainline vanilla llama.cpp, ollama, LM Studio, KoboldCpp, etc!
 
@@ -36,8 +34,8 @@ Perplexity computed against *wiki.test.raw*. These first two are just test quant
  * `Q8_0` 475.297 GiB (8.503 BPW)
  - Final estimate: PPL = TODO
 
- ## `IQ5_K` TODO
- Final estimate: TODO
+ ## `IQ5_K` 329.804 GiB (5.900 BPW)
+ Final estimate: PPL = 5.1073 +/- 0.03268
 
  <details>
 
@@ -81,7 +79,7 @@ numactl -N 0 -m 0 \
 
  </details>
 
- ## `IQ4_K` TODO
+ ## `IQ4_K` 273.041 GiB (4.885 BPW)
  Final estimate: TODO
 
  <details>
@@ -126,8 +124,8 @@ numactl -N 0 -m 0 \
 
  </details>
 
- ## `IQ3_K` TODO
- Final estimate: TODO
+ ## `IQ3_K` 216.047 GiB (3.865 BPW)
+ Final estimate: PPL = 5.1808 +/- 0.03319
 
  <details>
 
@@ -261,8 +259,8 @@ numactl -N 0 -m 0 \
 
  </details>
 
- ## `IQ2_KS` TODO
- Final estimate: TODO
+ ## `IQ2_KS` 144.126 GiB (2.578 BPW)
+ Final estimate: PPL = 5.6658 +/- 0.03716
 
  <details>
 
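
Reviewer note: the new headings pair a file size in GiB with a bits-per-weight (BPW) figure. As a quick sanity check, the two should imply about the same weight count for every quant. The sketch below is not part of the commit. It assumes that BPW means total file bits divided by the number of weights and that GiB means 2^30 bytes; the helper name `implied_weights` is hypothetical.

```python
# Sanity check: size (GiB) and BPW from the README headings should imply
# a consistent weight count across quants.
# Assumptions (not stated in the commit): BPW = total bits / weight count,
# and 1 GiB = 2**30 bytes.

GIB = 2**30  # bytes per GiB

# (size in GiB, bits per weight) as listed in the diff
quants = {
    "Q8_0": (475.297, 8.503),
    "IQ5_K": (329.804, 5.900),
    "IQ4_K": (273.041, 4.885),
    "IQ3_K": (216.047, 3.865),
    "IQ2_KS": (144.126, 2.578),
}


def implied_weights(size_gib: float, bpw: float) -> float:
    """Return the weight count implied by a file size and a BPW figure."""
    return size_gib * GIB * 8 / bpw


implied = {name: implied_weights(size, bpw) for name, (size, bpw) in quants.items()}

for name, count in implied.items():
    print(f"{name}: {count / 1e9:.1f}B weights")
```

Under those assumptions every entry comes out at roughly 480 billion weights, which matches the model name and suggests the published size and BPW pairs are internally consistent.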