ubergarm committed
Commit beec9f1 · Parent: 0b92986

Add IQ2_KS

Files changed (1): README.md (+51 -5)
README.md CHANGED
@@ -19,8 +19,12 @@ tags:
 - [x] cast fp8 safetensors to bf16 safetensors
 - [x] convert to bf16 GGUF
 - [x] quantize Q8_0 without imatrix
-- [ ] calculate and upload imatrix from Q8_0
-- [ ] begin quantizing and releasing
+- [x] calculate and upload imatrix from Q8_0 (note: the imatrix is missing data for a few tensors, see https://huggingface.co/ubergarm/Kimi-K2-Instruct-0905-GGUF/discussions/1#68bc58de31fa67452e075b9f)
+- [x] begin quantizing and releasing
+- [x] IQ2_KS
+- [ ] smol-IQ4_KSS
+- [ ] smol-IQ2_KL
+- [ ] etc...
 
 Open a discussion if you have a specific target RAM+VRAM in mind for your rig and I'll see what I can do given the available quants. Cheers!
 
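For the `calculate and upload imatrix from Q8_0` item above, the importance matrix comes from running the `llama-imatrix` tool over a calibration corpus. A minimal sketch of such an invocation, assuming a `llama-imatrix` binary in the same `./build/bin` tree as the recipe below; the calibration corpus and the Q8_0 model filename are placeholders, and only the output `.dat` path matches the file the quantize recipe consumes:

```bash
# Hypothetical invocation: the -f corpus and the Q8_0 model filename
# are placeholders; the -o path is the imatrix the recipe below reads.
./build/bin/llama-imatrix \
    -m /mnt/data/models/ubergarm/Kimi-K2-Instruct-0905-GGUF/Kimi-K2-Instruct-0905-Q8_0.gguf \
    -f calibration-corpus.txt \
    -o /mnt/data/models/ubergarm/Kimi-K2-Instruct-0905-GGUF/imatrix-Kimi-K2-Instruct-0905-Q8_0.dat
```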
@@ -85,15 +89,57 @@ echo TODO
 
 </details>
 
-### `IQ2_KS` TODO
-Final estimate: PPL = TODO
+### `IQ2_KS` 289.820 GiB (2.425 BPW)
+Final estimate: PPL = 3.2478 +/- 0.01721
 
 <details>
 
 <summary>👈 Secret Recipe</summary>
 
 ```bash
-echo TODO
+#!/usr/bin/env bash
+
+custom="
+## Attention [0-60] (GPU)
+blk\..*\.attn_k_b\.weight=q8_0
+blk\..*\.attn_v_b\.weight=q8_0
+
+# Balance of attn tensors
+blk\..*\.attn_kv_a_mqa\.weight=q8_0
+blk\..*\.attn_q_a\.weight=q8_0
+blk\..*\.attn_q_b\.weight=q8_0
+blk\..*\.attn_output\.weight=q8_0
+
+## First Single Dense Layer [0] (GPU)
+blk\..*\.ffn_down\.weight=q8_0
+blk\..*\.ffn_(gate|up)\.weight=q8_0
+
+## Shared Expert [1-60] (GPU)
+blk\..*\.ffn_down_shexp\.weight=q8_0
+blk\..*\.ffn_(gate|up)_shexp\.weight=q8_0
+
+## Routed Experts [1-60] (CPU)
+blk\..*\.ffn_down_exps\.weight=iq2_kl
+blk\..*\.ffn_(gate|up)_exps\.weight=iq2_ks
+
+## Token embedding and output tensors (GPU)
+token_embd\.weight=iq4_k
+output\.weight=iq6_k
+"
+
+custom=$(
+  echo "$custom" | grep -v '^#' | \
+  sed -Ez 's:\n+:,:g;s:,$::;s:^,::'
+)
+
+numactl -N 1 -m 1 \
+./build/bin/llama-quantize \
+    --custom-q "$custom" \
+    --imatrix /mnt/data/models/ubergarm/Kimi-K2-Instruct-0905-GGUF/imatrix-Kimi-K2-Instruct-0905-Q8_0.dat \
+    /mnt/data/models/ubergarm/Kimi-K2-Instruct-0905-GGUF/Kimi-K2-384x14B-Instruct-safetensors-0905-BF16-00001-of-00046.gguf \
+    /mnt/data/models/ubergarm/Kimi-K2-Instruct-0905-GGUF/Kimi-K2-Instruct-0905-IQ2_KS.gguf \
+    IQ2_KS \
+    192
 ```
 
 </details>
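In the recipe above, the `custom=$( ... )` pipeline collapses the human-readable rule block into the single comma-separated regex list that `--custom-q` expects: `grep -v '^#'` drops every comment line, and `sed -Ez` joins the surviving rules with commas while trimming any leading or trailing comma. A standalone sketch of the same transformation, with two illustrative rules:

```bash
#!/usr/bin/env bash
# Demo of the comment-stripping / comma-joining idiom from the recipe.
# The two rules are illustrative; any tensor-regex=quant pairs work.
rules="
# comment and blank lines are removed
token_embd\.weight=iq4_k
output\.weight=iq6_k
"
joined=$(echo "$rules" | grep -v '^#' | sed -Ez 's:\n+:,:g;s:,$::;s:^,::')
echo "$joined"   # prints: token_embd\.weight=iq4_k,output\.weight=iq6_k
```

Keeping the rules as a commented multi-line string keeps the tensor-to-quant mapping readable, while the joined one-liner is the form the CLI actually consumes.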
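The `Final estimate: PPL = 3.2478 +/- 0.01721` figure above is the closing line printed by a perplexity run over the finished quant. A hedged sketch of how such a measurement is typically taken; the test corpus and context size here are assumptions, not confirmed settings for this particular number:

```bash
# Hypothetical invocation; wiki.test.raw and the context size are
# assumptions, not necessarily what produced the figure above.
./build/bin/llama-perplexity \
    -m /mnt/data/models/ubergarm/Kimi-K2-Instruct-0905-GGUF/Kimi-K2-Instruct-0905-IQ2_KS.gguf \
    -f wiki.test.raw \
    --ctx-size 512
```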