leonardlin committed · Commit 21e2c40 · verified · 1 Parent(s): bb2f9e1

Update README.md

Files changed (1): README.md (+1 −1)
README.md CHANGED
@@ -17,7 +17,7 @@ quantized_by: leonardlin
  ## About
  This repo contains select GGUF quants of [shisa-ai/shisa-v2-llama3.1-405b](https://huggingface.co/shisa-ai/shisa-v2-llama3.1-405b)
  - All quants were created with `b5503` of upstream [llama.cpp](https://github.com/ggerganov/llama.cpp)
- - All quants are weighted/imatrix quants created from our [shisa-ai/shisa-v2-sharegpt](https://huggingface.co/datasets/shisa-ai/shisa-v2-sharegpt) bilingual dataset on the fp16 model
+ - All quants are weighted/imatrix quants created from our [shisa-ai/shisa-v2-sharegpt](https://huggingface.co/datasets/shisa-ai/shisa-v2-sharegpt) bilingual dataset on the fp16 model except for the Q8_0
  - Files are pre-split at 45GB (below HF's 50GB upload limit). Modern llama.cpp builds should be able to load the sequential files automatically, but you can use `llama-gguf-split --merge` if you want to merge them back together
 
  ## Provided Quants
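
The merge step mentioned in the README bullet above can be sketched as the following shell snippet. This is a minimal sketch assuming llama.cpp's standard shard naming (`-NNNNN-of-NNNNN.gguf`); the model filename used here is hypothetical, not an actual file from this repo.

```shell
# Hypothetical first shard, following llama.cpp's split naming convention.
first_shard="shisa-v2-llama3.1-405b-Q4_K_M-00001-of-00009.gguf"

# Derive a merged output name by stripping the "-NNNNN-of-NNNNN" shard suffix.
merged="${first_shard%-[0-9][0-9][0-9][0-9][0-9]-of-[0-9][0-9][0-9][0-9][0-9].gguf}.gguf"
echo "$merged"   # shisa-v2-llama3.1-405b-Q4_K_M.gguf

# Point llama-gguf-split at the first shard; it finds the remaining shards
# itself. (Commented out so the snippet runs without the binary installed.)
# llama-gguf-split --merge "$first_shard" "$merged"
```

Note that merging is optional: as the README states, modern llama.cpp builds load the first shard directly and pick up the sequential files automatically.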