GeorgyGUF committed (verified)
Commit b0aceae · Parent(s): ab1d77c

Update README.md

Files changed (1): README.md (+1 / −1)
README.md CHANGED

````diff
@@ -195,7 +195,7 @@ license_name: llama4
 pipeline_tag: image-text-to-text
 ---
 
-Currently text only is supported. Created with llama.cpp b5074: `python llama.cpp/convert_hf_to_gguf.py --outfile Llama-4-Maverick-17B-128E-Instruct-bf16.gguf --outtype bf16 models--unsloth--Llama-4-Maverick-17B-128E-Instruct/snapshots/4d0b9b85d7b4c203d8354c4b645021d1985032c1 --use-temp-file`. I did this to be able to create proper quantisation by running this command command: `llama-quantize --leave-output-tensor --token-embedding-type BF16 Llama-4-Maverick-17B-128E-Instruct-bf16.gguf Llama-4-Maverick-17B-128E-Instruct-q8-with-bf16-embedding-and-bf16-output.gguf Q8_0`. You can check my quant here: https://huggingface.co/GeorgyGUF/Llama-4-Maverick-17B-128E-Instruct-q8-with-bf16-embedding-and-bf16-output.gguf
+Currently text only is supported. Created with llama.cpp b5074: `python llama.cpp/convert_hf_to_gguf.py --outfile Llama-4-Maverick-17B-128E-Instruct-bf16.gguf --outtype bf16 models--unsloth--Llama-4-Maverick-17B-128E-Instruct/snapshots/4d0b9b85d7b4c203d8354c4b645021d1985032c1 --use-temp-file`. I did this to be able to create proper quantisation by running this command: `llama-quantize --leave-output-tensor --token-embedding-type BF16 Llama-4-Maverick-17B-128E-Instruct-bf16.gguf Llama-4-Maverick-17B-128E-Instruct-q8-with-bf16-embedding-and-bf16-output.gguf Q8_0`. You can check my quant here: https://huggingface.co/GeorgyGUF/Llama-4-Maverick-17B-128E-Instruct-q8-with-bf16-embedding-and-bf16-output.gguf
 
 **Chat template/prompt format:**
 ```
````
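
For reference, the changed line describes a two-step convert-then-quantise workflow. Below is a minimal sketch of those same steps as a script, assuming llama.cpp b5074 is cloned and built, `llama-quantize` is on the PATH, and the unsloth snapshot directory named in the diff exists locally (e.g. inside a `huggingface_hub` cache):

```bash
#!/usr/bin/env bash
set -euo pipefail

# Step 1: convert the HF checkpoint to a BF16 GGUF.
# --use-temp-file spools tensors through a temporary file to limit RAM usage.
python llama.cpp/convert_hf_to_gguf.py \
  --outfile Llama-4-Maverick-17B-128E-Instruct-bf16.gguf \
  --outtype bf16 \
  models--unsloth--Llama-4-Maverick-17B-128E-Instruct/snapshots/4d0b9b85d7b4c203d8354c4b645021d1985032c1 \
  --use-temp-file

# Step 2: quantise to Q8_0 while keeping the output tensor
# (--leave-output-tensor) and the token embeddings
# (--token-embedding-type BF16) in BF16.
llama-quantize \
  --leave-output-tensor \
  --token-embedding-type BF16 \
  Llama-4-Maverick-17B-128E-Instruct-bf16.gguf \
  Llama-4-Maverick-17B-128E-Instruct-q8-with-bf16-embedding-and-bf16-output.gguf \
  Q8_0
```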