This model was generated using [llama.cpp](https://github.com/ggerganov/llama.cpp).

## <span style="color: #7FFF7F;"> Quantization beyond the IMatrix</span>

I am testing a new quantization method that uses rules to bump important layers above the precision the standard IMatrix would assign.

I have found that the standard IMatrix does not perform very well at low-bit quantization or with MoE models, so I am using the llama.cpp `--tensor-type` option to bump selected layers to higher precision. See [Layer bumping with llama.cpp](https://github.com/Mungert69/GGUFModelBuilder/blob/main/model-converter/tensor_list_builder.py).
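As a minimal sketch, a per-tensor override during quantization might look like this (the model paths and tensor names below are illustrative assumptions; check your llama.cpp build for the exact `--tensor-type` syntax):

```shell
# Sketch only (assumed paths and tensor names): quantize to Q4_K_M overall,
# but bump the ffn_down and attn_v tensors to Q6_K via per-tensor overrides.
./llama-quantize \
  --tensor-type ffn_down=q6_k \
  --tensor-type attn_v=q6_k \
  model-f16.gguf model-bumped.gguf q4_k_m
```

Repeating `--tensor-type` per pattern is what lets a rule-generated list (such as one produced by the linked tensor_list_builder.py script) override precision layer by layer.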