Update README.md
README.md
---
license: apache-2.0
---

EXL3 quants of [Qwen3-30B-A3B](https://huggingface.co/Qwen/Qwen3-30B-A3B)

[2.25 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/2.25bpw)
[3.00 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/3.0bpw)
[4.00 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/4.0bpw)
[5.00 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/5.0bpw)
[6.00 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/6.0bpw)
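
Each bitrate lives on its own branch of this repo, with branch names matching the links above. As a minimal sketch, a single quant can be fetched with `huggingface_hub` by passing the branch name as the revision; the local directory below is only an example:

```python
# Minimal sketch: download one quant branch of this repo with huggingface_hub.
# Branch names match the links above: 2.25bpw, 3.0bpw, 4.0bpw, 5.0bpw, 6.0bpw.
# The local_dir path is only an example.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="turboderp/Qwen3-30B-A3B-exl3",
    revision="4.0bpw",                      # pick the bitrate you want
    local_dir="Qwen3-30B-A3B-exl3-4.0bpw",  # optional target directory
)
print(local_path)
```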

While I work out a way to meaningfully measure perplexity for such a sparse model, here are some other tests:

| Model    | HumanEval pass@1 | KL-div vs. FP16 (wiki2, 20k tokens) | Top-1 agreement vs. FP16 |
|----------|------------------|-------------------------------------|--------------------------|
| 2.25 bpw | 88.41%           | 0.1416                              | 84.78%                   |
| 3.00 bpw | 89.63%           | 0.0688                              | 89.44%                   |
| 4.00 bpw | 92.07%           | 0.0215                              | 94.33%                   |
| 5.00 bpw | 93.29%           | 0.0094                              | 96.24%                   |
| 6.00 bpw | 92.68%           | 0.0054                              | 97.45%                   |
| FP16     | 91.46%           | -                                   | -                        |
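
For reference, here is a minimal sketch of how the last two columns could be computed, assuming per-token logits from the quantized model and the FP16 baseline over the same evaluation tokens are already available. The function and placeholder tensors below are illustrative, not the exact harness used for the table:

```python
# Minimal sketch (not the exact harness used above): KL divergence and top-1
# agreement between a quantized model and its FP16 reference, given logits of
# shape (num_tokens, vocab_size) computed over the same input tokens.
import torch
import torch.nn.functional as F

def compare_to_fp16(quant_logits: torch.Tensor, fp16_logits: torch.Tensor):
    # Mean per-token KL(FP16 || quant) over the evaluated positions
    log_p_quant = F.log_softmax(quant_logits.float(), dim=-1)
    log_p_fp16 = F.log_softmax(fp16_logits.float(), dim=-1)
    kl = F.kl_div(log_p_quant, log_p_fp16, log_target=True, reduction="batchmean")

    # Fraction of positions where both models agree on the top-1 token
    top1 = (quant_logits.argmax(dim=-1) == fp16_logits.argmax(dim=-1)).float().mean()
    return kl.item(), top1.item()

# Tiny random placeholders; a real run would use actual model logits over the
# ~20k wiki2 token positions and the full vocabulary.
fp16 = torch.randn(512, 4096)
quant = fp16 + 0.1 * torch.randn_like(fp16)
print(compare_to_fp16(quant, fp16))
```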