---
license: apache-2.0
---

# EXL3 quants of Qwen3-30B-A3B

- 2.25 bits per weight
- 3.00 bits per weight
- 4.00 bits per weight
- 5.00 bits per weight
- 6.00 bits per weight
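As a rough guide to which quant fits a given GPU, weight storage scales linearly with bits per weight. A back-of-envelope sketch (assuming roughly 30.5B total parameters for Qwen3-30B-A3B, and ignoring quantization overhead and any layers kept at higher precision):

```python
def approx_weight_size_gb(n_params: float, bpw: float) -> float:
    # Approximate on-disk / in-VRAM size of the weights alone:
    # parameters * bits-per-weight / 8 bits-per-byte, in gigabytes.
    # Real checkpoints add some overhead (scales, metadata), so treat
    # this as a lower bound, not an exact figure.
    return n_params * bpw / 8 / 1e9

for bpw in (2.25, 3.00, 4.00, 5.00, 6.00):
    print(f"{bpw:.2f} bpw ~ {approx_weight_size_gb(30.5e9, bpw):.1f} GB")
```

Actual VRAM use will be higher once the KV cache and activation buffers are included.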

While I work out a way to meaningfully measure perplexity for such a sparse model, here are some other tests:

| Model | HumanEval pass@1 | KL div. vs. FP16 (wiki2, 20k tokens) | Top-1 agreement vs. FP16 |
| --- | --- | --- | --- |
| 2.25 bpw | 88.41% | 0.1416 | 84.78% |
| 3.00 bpw | 89.63% | 0.0688 | 89.44% |
| 4.00 bpw | 92.07% | 0.0215 | 94.33% |
| 5.00 bpw | 93.29% | 0.0094 | 96.24% |
| 6.00 bpw | 92.68% | 0.0054 | 97.45% |
| FP16 | 91.46% | - | - |
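For reference, the two comparison metrics in the table are straightforward to compute given per-token logits from the quantized and FP16 models. A minimal sketch (the function names and the NumPy formulation here are illustrative, not the actual evaluation code):

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    # Numerically stable softmax over the vocabulary axis
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mean_kl_divergence(fp16_logits: np.ndarray, quant_logits: np.ndarray) -> float:
    # Mean per-token KL(P_fp16 || P_quant) over a token sequence;
    # a small epsilon guards against log(0).
    p = softmax(fp16_logits)
    q = softmax(quant_logits)
    eps = 1e-10
    return float((p * (np.log(p + eps) - np.log(q + eps))).sum(axis=-1).mean())

def top1_agreement(fp16_logits: np.ndarray, quant_logits: np.ndarray) -> float:
    # Fraction of positions where both models rank the same token first
    return float((fp16_logits.argmax(-1) == quant_logits.argmax(-1)).mean())
```

Both metrics improve monotonically with bit rate in the table above: KL divergence falls toward 0 and top-1 agreement rises toward 100% as the quantized output distribution approaches FP16.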