Mungert committed
Commit 037b2c6 · verified · 1 parent: d974ae4

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md (+7 −13)
README.md CHANGED
@@ -9,19 +9,12 @@ library_name: transformers
 
 # <span style="color: #7FFF7F;">GLM-Z1-32B-0414 GGUF Models</span>
 
-> **⚠️ Important Note:**
->
-> When using **llama.cpp**, you may experience **repeat text after 64 tokens**.
->
-> Add this option to resolve it:
-> `--override-kv glm4.rope.dimension_count=int:64`
->
-> **Example usage:**
-> ```bash
-> ./llama-cli -m GLM-Z1-9B-0414-iq3_m.gguf -cnv --override-kv glm4.rope.dimension_count=int:64
-> ```
->
-> Source: [llama.cpp GitHub Issue #12946](https://github.com/ggml-org/llama.cpp/issues/12946)
+
+## <span style="color: #7F7FFF;">Model Generation Details</span>
+
+This model was generated using [llama.cpp](https://github.com/ggerganov/llama.cpp) at commit [`e291450`](https://github.com/ggerganov/llama.cpp/commit/e291450b7602d7a36239e4ceeece37625f838373).
+
+
 
 
 ## <span style="color: #7FFF7F;">Ultra-Low-Bit Quantization with IQ-DynamicGate (1-2 bit)</span>
@@ -78,6 +71,7 @@ All tests conducted on **Llama-3-8B-Instruct** using:
 ✔ **Research** into ultra-low-bit quantization
 
 
+
 ## **Choosing the Right Model Format**
 
 Selecting the correct model format depends on your **hardware capabilities** and **memory constraints**.