Update README.md
NEO dataset performance improvements will show the most in the IQ4_NL quant, followed by the MXFP4_MOE quant.

IQ4_NL quant:
- OpenAI-120B-NEO-IQ4_NL.gguf : Standard Imatrix + Output tensor at IQ4_NL (NEO Imatrix) AND Embed at IQ4_NL.

NEO MXFP4_MOE quant:
- OpenAI-120B-NEO-MXFP4_MOE.gguf : Output tensor at IQ4_NL (NEO Imatrix) AND Embed at IQ4_NL - this makes it the smallest version.

MXFP4_MOE quants vastly outperform (at the moment) all other quants except IQ4_NL, Q5_1 and Q8_0, because OpenAI's 20B model has odd "tensor" dimensions that cause compression issues for the other quant types (as of this writing).

IQ4_NL, Q5_1 and Q8_0 quants are compatible with OpenAI's tensor structure.
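The per-tensor overrides described above (output tensor and embeddings forced to IQ4_NL) correspond to llama.cpp's `llama-quantize` options. A sketch only, with placeholder file names - the exact flags available depend on your llama.cpp build:

```shell
# Sketch, not the author's exact recipe: file names below are placeholders.
# --output-tensor-type / --token-embedding-type override the quant type used
# for the output and token-embedding tensors, which is how "Output tensor at
# IQ4_NL AND Embed at IQ4_NL" variants are produced; the trailing positional
# argument sets the base quant type for the remaining tensors.
./llama-quantize \
  --imatrix neo-imatrix.dat \
  --output-tensor-type iq4_nl \
  --token-embedding-type iq4_nl \
  OpenAI-120B-f16.gguf \
  OpenAI-120B-NEO-IQ4_NL.gguf \
  IQ4_NL
```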
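The "smallest version" claim comes down to bits per weight: each ggml block quant has a fixed storage cost. A back-of-the-envelope sketch, assuming a nominal 120B parameter count and ignoring per-tensor overrides and metadata, using the known ggml block sizes:

```python
# Rough GGUF size estimates from ggml block sizes:
# IQ4_NL stores 32 weights in 18 bytes (4.5 bpw), Q5_1 in 24 bytes (6.0 bpw),
# Q8_0 in 34 bytes (8.5 bpw). Real files differ because some tensors
# (output, embeddings) are kept at other types, plus metadata overhead.

PARAMS = 120e9  # assumed nominal parameter count, for illustration only

BITS_PER_WEIGHT = {
    "IQ4_NL": 18 * 8 / 32,  # 4.5 bpw
    "Q5_1":   24 * 8 / 32,  # 6.0 bpw
    "Q8_0":   34 * 8 / 32,  # 8.5 bpw
}

def approx_size_gb(params: float, bpw: float) -> float:
    """Approximate file size in GB: params * bits-per-weight / 8 bits / 1e9."""
    return params * bpw / 8 / 1e9

for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{approx_size_gb(PARAMS, bpw):.1f} GB at {bpw} bpw")
# IQ4_NL: ~67.5 GB at 4.5 bpw
# Q5_1:   ~90.0 GB at 6.0 bpw
# Q8_0:  ~127.5 GB at 8.5 bpw
```

The spread between 4.5 and 8.5 bpw is why an IQ4_NL-heavy layout lands well under the Q8_0 file size.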