ubergarm/Qwen3-Coder-480B-A35B-Instruct-GGUF

ubergarm committed on Jul 23

Commit 7972ce1

1 Parent(s): a4cc885

uploading more and publishing Perplexities
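
As a quick sanity check on the figures this commit publishes, the sketch below recomputes the weight count implied by each file size and BPW pair, and the perplexity increase of each measured quant relative to `IQ5_K`. It assumes BPW means total file bits divided by the number of weights, which is the usual reading but is not stated in the README; `implied_weights` and `ppl_increase_pct` are hypothetical helper names, and `IQ5_K` is used as the baseline only because the `Q8_0` perplexity is still TODO.

```python
GIB = 2**30  # bytes per GiB

# name: (size in GiB, BPW, perplexity or None if still TODO), copied from the diff
QUANTS = {
    "Q8_0": (475.297, 8.503, None),
    "IQ5_K": (329.804, 5.900, 5.1073),
    "IQ4_K": (273.041, 4.885, None),
    "IQ3_K": (216.047, 3.865, 5.1808),
    "IQ2_KS": (144.126, 2.578, 5.6658),
}


def implied_weights(size_gib, bpw):
    """Weight count implied by size and BPW (assumes BPW = total bits / weights)."""
    return size_gib * GIB * 8 / bpw


def ppl_increase_pct(ppl, baseline):
    """Relative perplexity increase over a baseline, in percent."""
    return (ppl / baseline - 1.0) * 100.0


baseline = QUANTS["IQ5_K"][2]
for name, (size, bpw, ppl) in QUANTS.items():
    line = f"{name:7s} {implied_weights(size, bpw) / 1e9:6.1f}B weights"
    if ppl is not None:
        line += f"  PPL {ppl_increase_pct(ppl, baseline):+.2f}% vs IQ5_K"
    print(line)
```

If the size and BPW columns are consistent, every row should imply roughly the same weight count, so a row that disagrees would point to a transcription error in the README.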

Files changed (1)

README.md CHANGED

@@ -12,8 +12,6 @@ tags:
 - ik_llama.cpp
 ---
-*WIP* Still cooking and will upload ASAP.
 ## `ik_llama.cpp` imatrix Quantizations of Qwen/Qwen3-Coder-480B-A35B-Instruct
 This quant collection **REQUIRES** [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/) fork to support the ik's latest SOTA quants and optimizations! Do **not** download these big files and expect them to run on mainline vanilla llama.cpp, ollama, LM Studio, KoboldCpp, etc!
@@ -36,8 +34,8 @@ Perplexity computed against *wiki.test.raw*. These first two are just test quant
 * `Q8_0` 475.297 GiB (8.503 BPW)
   - Final estimate: PPL = TODO
-## `IQ5_K` TODO
-Final estimate: TODO
+## `IQ5_K` 329.804 GiB (5.900 BPW)
+Final estimate: PPL = 5.1073 +/- 0.03268
 <details>
@@ -81,7 +79,7 @@ numactl -N 0 -m 0 \
 </details>
-## `IQ4_K` TODO
+## `IQ4_K` 273.041 GiB (4.885 BPW)
 Final estimate: TODO
 <details>
@@ -126,8 +124,8 @@ numactl -N 0 -m 0 \
 </details>
-## `IQ3_K` TODO
-Final estimate: TODO
+## `IQ3_K` 216.047 GiB (3.865 BPW)
+Final estimate: PPL = 5.1808 +/- 0.03319
 <details>
@@ -261,8 +259,8 @@ numactl -N 0 -m 0 \
 </details>
-## `IQ2_KS` TODO
-Final estimate: TODO
+## `IQ2_KS` 144.126 GiB (2.578 BPW)
+Final estimate: PPL = 5.6658 +/- 0.03716
 <details>