Update README.md
README.md CHANGED
@@ -195,7 +195,7 @@ license_name: llama4
 pipeline_tag: image-text-to-text
 ---
 
-Currently text only is supported. Created with llama.cpp b5074: `python llama.cpp/convert_hf_to_gguf.py --outfile Llama-4-Maverick-17B-128E-Instruct-bf16.gguf --outtype bf16 models--unsloth--Llama-4-Maverick-17B-128E-Instruct/snapshots/4d0b9b85d7b4c203d8354c4b645021d1985032c1 --use-temp-file`. I did this to be able to create proper quantisation by running this command
+Currently text only is supported. Created with llama.cpp b5074: `python llama.cpp/convert_hf_to_gguf.py --outfile Llama-4-Maverick-17B-128E-Instruct-bf16.gguf --outtype bf16 models--unsloth--Llama-4-Maverick-17B-128E-Instruct/snapshots/4d0b9b85d7b4c203d8354c4b645021d1985032c1 --use-temp-file`. I did this to be able to create proper quantisation by running this command: `llama-quantize --leave-output-tensor --token-embedding-type BF16 Llama-4-Maverick-17B-128E-Instruct-bf16.gguf Llama-4-Maverick-17B-128E-Instruct-q8-with-bf16-embedding-and-bf16-output.gguf Q8_0`. You can check my quant here: https://huggingface.co/GeorgyGUF/Llama-4-Maverick-17B-128E-Instruct-q8-with-bf16-embedding-and-bf16-output.gguf
 
 **Chat template/prompt format:**
 ```
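For readability, the conversion and quantisation steps recorded in the added line above can be laid out as a short shell sequence. The commands, flags, file names, and snapshot hash are taken verbatim from the commit; the sketch assumes llama.cpp b5074 is checked out and the `llama-quantize` binary is built and on the PATH.

```sh
# Step 1: convert the downloaded HF snapshot to a BF16 GGUF (llama.cpp b5074).
python llama.cpp/convert_hf_to_gguf.py \
  --outfile Llama-4-Maverick-17B-128E-Instruct-bf16.gguf \
  --outtype bf16 \
  models--unsloth--Llama-4-Maverick-17B-128E-Instruct/snapshots/4d0b9b85d7b4c203d8354c4b645021d1985032c1 \
  --use-temp-file

# Step 2: quantise to Q8_0 while leaving the output tensor unquantised
# and keeping the token embeddings in BF16.
llama-quantize \
  --leave-output-tensor \
  --token-embedding-type BF16 \
  Llama-4-Maverick-17B-128E-Instruct-bf16.gguf \
  Llama-4-Maverick-17B-128E-Instruct-q8-with-bf16-embedding-and-bf16-output.gguf \
  Q8_0
```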