Upload README.md with huggingface_hub
README.md CHANGED
@@ -9,19 +9,12 @@ library_name: transformers
 
 # <span style="color: #7FFF7F;">GLM-Z1-32B-0414 GGUF Models</span>
 
-
->
-
-
-
-
->
-> **Example usage:**
-> ```bash
-> ./llama-cli -m GLM-Z1-9B-0414-iq3_m.gguf -cnv --override-kv glm4.rope.dimension_count=int:64
-> ```
->
-> Source: [llama.cpp GitHub Issue #12946](https://github.com/ggml-org/llama.cpp/issues/12946)
+
+## <span style="color: #7F7FFF;">Model Generation Details</span>
+
+This model was generated using [llama.cpp](https://github.com/ggerganov/llama.cpp) at commit [`e291450`](https://github.com/ggerganov/llama.cpp/commit/e291450b7602d7a36239e4ceeece37625f838373).
+
+
 
 
 ## <span style="color: #7FFF7F;">Ultra-Low-Bit Quantization with IQ-DynamicGate (1-2 bit)</span>
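The new "Model Generation Details" section pins an exact llama.cpp commit, which is enough to reproduce the conversion. Below is a minimal reproduction sketch, assuming a Linux machine with git, cmake, and Python available; the source-weights path, output filenames, and `Q4_K_M` quant type are illustrative assumptions, not details taken from this README.

```bash
# Build llama.cpp at the commit cited in "Model Generation Details".
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
git checkout e291450b7602d7a36239e4ceeece37625f838373
cmake -B build && cmake --build build --config Release

# Convert the original Hugging Face weights to a full-precision GGUF.
# (Assumes the upstream GLM-Z1-32B-0414 checkpoint was downloaded
# to ./GLM-Z1-32B-0414 beforehand.)
pip install -r requirements.txt
python convert_hf_to_gguf.py ./GLM-Z1-32B-0414 \
    --outfile GLM-Z1-32B-0414-f16.gguf --outtype f16

# Quantize. Q4_K_M is only an example target type, not necessarily
# the one used for the files in this repo.
./build/bin/llama-quantize GLM-Z1-32B-0414-f16.gguf \
    GLM-Z1-32B-0414-Q4_K_M.gguf Q4_K_M
```

As for the blockquote this commit removes: `--override-kv glm4.rope.dimension_count=int:64` forces a GGUF metadata value at load time, which the linked issue #12946 describes as a workaround for GLM-4 RoPE metadata; presumably conversions at the pinned commit embed the correct value, which would explain why the note was dropped.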
|
@@ -78,6 +71,7 @@ All tests conducted on **Llama-3-8B-Instruct** using:
 ✔ **Research** into ultra-low-bit quantization
 
 
+
 ## **Choosing the Right Model Format**
 
 Selecting the correct model format depends on your **hardware capabilities** and **memory constraints**.
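The "Choosing the Right Model Format" section keys its advice to hardware capabilities and memory constraints; below is a quick sketch for checking what a machine actually has before picking a quant, assuming Linux (the `nvidia-smi` query applies only to NVIDIA GPUs).

```bash
# Report system RAM and, if present, GPU VRAM before choosing a quant size.
free -h
nvidia-smi --query-gpu=name,memory.total --format=csv,noheader 2>/dev/null \
  || echo "no NVIDIA GPU detected"
# Rough rule of thumb: a GGUF's file size approximates the memory its
# weights occupy, plus extra headroom for the KV cache and context.
```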
|