Mungert committed
Commit 037b2c6 · verified · 1 parent: d974ae4

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md (+7 −13)
README.md CHANGED
@@ -9,19 +9,12 @@ library_name: transformers
 
 # <span style="color: #7FFF7F;">GLM-Z1-32B-0414 GGUF Models</span>
 
-> **⚠️ Important Note:**
->
-> When using **llama.cpp**, you may experience **repeat text after 64 tokens**.
->
-> Add this option to resolve it:
-> `--override-kv glm4.rope.dimension_count=int:64`
->
-> **Example usage:**
-> ```bash
-> ./llama-cli -m GLM-Z1-9B-0414-iq3_m.gguf -cnv --override-kv glm4.rope.dimension_count=int:64
-> ```
->
-> Source: [llama.cpp GitHub Issue #12946](https://github.com/ggml-org/llama.cpp/issues/12946)
+
+## <span style="color: #7F7FFF;">Model Generation Details</span>
+
+This model was generated using [llama.cpp](https://github.com/ggerganov/llama.cpp) at commit [`e291450`](https://github.com/ggerganov/llama.cpp/commit/e291450b7602d7a36239e4ceeece37625f838373).
+
+
 
 
 ## <span style="color: #7FFF7F;">Ultra-Low-Bit Quantization with IQ-DynamicGate (1-2 bit)</span>
@@ -78,6 +71,7 @@ All tests conducted on **Llama-3-8B-Instruct** using:
 ✔ **Research** into ultra-low-bit quantization
 
 
+
 ## **Choosing the Right Model Format**
 
 Selecting the correct model format depends on your **hardware capabilities** and **memory constraints**.