Commit 074329a (verified) by bartowski · 1 Parent(s): eb8bc72

Update README.md

Files changed (1)
  1. README.md +6 -0
README.md CHANGED
@@ -17,6 +17,12 @@ base_model: google/gemma-3-1b-it-qat-q4_0-unquantized
 
 ## Llamacpp imatrix Quantizations of gemma-3-1b-it-qat by google
 
+These are derived from the QAT (quantization-aware training) weights provided by Google.
+
+*ONLY* Q4_0 is expected to be better, but figured while I'm at it I might as well make others to see what happens?
+
+[gemma-3-1b-it-qat-Q4_0.gguf](https://huggingface.co/bartowski/google_gemma-3-1b-it-qat-GGUF/blob/main/google_gemma-3-1b-it-qat-Q4_0.gguf) | Q4_0 | 0.72GB | Should be improved due to QAT, offers online repacking for ARM and AVX CPU inference.
+
 Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b5147">b5147</a> for quantization.
 
 Original model: https://huggingface.co/google/gemma-3-1b-it-qat-q4_0-unquantized
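
The Q4_0 link added in this commit points at the file viewer (`/blob/`), not at the file itself. As a minimal sketch, assuming the usual Hugging Face convention that swapping `/blob/` for `/resolve/` gives a direct-download URL, the conversion could look like this. The helper name `blob_to_resolve` and the constant `Q4_0_BLOB_URL` are illustrative only and are not part of the README or of any library.

```python
from urllib.parse import urlsplit, urlunsplit

# URL copied from the Q4_0 row added in this commit.
Q4_0_BLOB_URL = (
    "https://huggingface.co/bartowski/google_gemma-3-1b-it-qat-GGUF"
    "/blob/main/google_gemma-3-1b-it-qat-Q4_0.gguf"
)


def blob_to_resolve(url: str) -> str:
    """Turn a file-viewer URL (/blob/) into a direct-download URL (/resolve/).

    Hypothetical helper: assumes the path has the shape
    /<owner>/<repo>/blob/<revision>/<file path...>.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # segments[0] is the empty string before the leading slash.
    if len(segments) < 6 or segments[3] != "blob":
        raise ValueError(f"not a blob-style URL: {url}")
    segments[3] = "resolve"
    return urlunsplit(parts._replace(path="/".join(segments)))


if __name__ == "__main__":
    # Prints a URL that a downloader such as curl or wget could fetch.
    print(blob_to_resolve(Q4_0_BLOB_URL))
```

The function only rewrites the string and makes no network request, so it does not confirm that the file exists.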