license: apache-2.0
---

This repository contains 3 versions of Qwen3-30B-A3B quantized with `IQ4_KS` (4.25 bpw quantization). The interesting part is that these models achieve a lower perplexity on `wiki.test.raw` than the original `bf16` model. This is surprising, considering that no QAT has been mentioned in the [Qwen3 announcement](https://qwenlm.github.io/blog/qwen3/). Hence I'm putting them out there for anyone interested in evaluating performance by means other than PPL, or just using them for local inference.

For more details see [this discussion](https://github.com/ikawrakow/ik_llama.cpp/discussions/359).
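For context, PPL is the exponential of the mean negative log-likelihood the model assigns to the ground-truth tokens of the test set. A minimal illustrative sketch of that definition (not the actual ik_llama.cpp evaluation code, which also handles chunking and context length):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood) over evaluated tokens.

    token_logprobs: natural-log probabilities the model assigned to each
    ground-truth token, e.g. collected from a wiki.test.raw evaluation run.
    """
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# A model that assigns every token probability 0.25 has PPL ~= 4,
# so a quantized model with lower PPL is, on average, assigning
# higher probability to the reference text than the bf16 original.
print(perplexity([math.log(0.25)] * 10))
```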