---
license: apache-2.0
---

EXL3 quants of [Qwen3-30B-A3B](https://huggingface.co/Qwen/Qwen3-30B-A3B)

[2.25 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/2.25bpw)
[3.00 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/3.0bpw)
[4.00 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/4.0bpw)
[5.00 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/5.0bpw)
[6.00 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/6.0bpw)
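
Each bitrate lives on its own branch of this repo, so a single quant can be fetched by revision. A minimal sketch using `huggingface_hub` (the chosen branch here is just an example):

```python
# Sketch: download one EXL3 quant by pointing snapshot_download at its branch.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="turboderp/Qwen3-30B-A3B-exl3",
    revision="4.0bpw",  # branch name: 2.25bpw / 3.0bpw / 4.0bpw / 5.0bpw / 6.0bpw
)
print(local_dir)  # path to the downloaded model files
```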

While I work out a way to meaningfully measure perplexity for such a sparse model, here are some other tests:

| Model    | HumanEval pass@1 | KL-div vs FP16 (wiki2, 20k tokens) | Top-1 agreement vs FP16 |
|----------|------------------|------------------------------------|-------------------------|
| 2.25 bpw | 88.41%           | 0.1416                             | 84.78%                  |
| 3.00 bpw | 89.63%           | 0.0688                             | 89.44%                  |
| 4.00 bpw | 92.07%           | 0.0215                             | 94.33%                  |
| 5.00 bpw | 93.29%           | 0.0094                             | 96.24%                  |
| 6.00 bpw | 92.68%           | 0.0054                             | 97.45%                  |
| FP16     | 91.46%           | -                                  | -                       |
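
For reference, a minimal sketch of how the last two columns could be reproduced from per-token logits over the same evaluation text. The function and tensor names are illustrative rather than taken from the actual eval script, and the direction of the KL divergence is an assumption (KL of the FP16 distribution against the quantized one):

```python
# Sketch: mean KL divergence and top-1 agreement of a quantized model vs. the
# FP16 reference, given matching logits of shape [num_tokens, vocab_size].
import torch
import torch.nn.functional as F

def compare_to_fp16(fp16_logits: torch.Tensor, quant_logits: torch.Tensor):
    ref_logprobs = F.log_softmax(fp16_logits.float(), dim=-1)
    test_logprobs = F.log_softmax(quant_logits.float(), dim=-1)
    # Mean KL(FP16 || quant) per token (direction assumed, not stated above).
    kl = (ref_logprobs.exp() * (ref_logprobs - test_logprobs)).sum(dim=-1).mean()
    # Fraction of positions where both models pick the same top-1 token.
    top1 = (ref_logprobs.argmax(dim=-1) == test_logprobs.argmax(dim=-1)).float().mean()
    return kl.item(), top1.item()
```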