---
base_model: GOAT-AI/GOAT-70B-Storytelling
language:
- en
library_name: transformers
license: llama2
model_type: llama
quantized_by: mradermacher
tags:
- facebook
- meta
- pytorch
- llama
- llama-2
- Storywriter
---
## About
Weighted/imatrix quants of https://huggingface.co/GOAT-AI/GOAT-70B-Storytelling
<!-- provided-files -->
## Usage
If you are unsure how to use GGUF files, refer to one of [TheBloke's
READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
more details, including how to concatenate multi-part files.
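The multi-part quants below (e.g. the Q6_K files) are plain byte-level splits, so the `.partXofY` pieces just need to be joined in order into a single `.gguf` before loading. A minimal Python sketch, assuming both parts were downloaded into the current directory (the filenames follow the naming used in the table below):

```python
# Join the pieces of a split GGUF download back into one file.
# Parts are simple byte splits, so concatenating them in order is enough.
import shutil

parts = [
    "GOAT-70B-Storytelling.i1-Q6_K.gguf.part1of2",
    "GOAT-70B-Storytelling.i1-Q6_K.gguf.part2of2",
]

with open("GOAT-70B-Storytelling.i1-Q6_K.gguf", "wb") as out:
    for part in parts:
        with open(part, "rb") as src:
            shutil.copyfileobj(src, out)  # stream each part into the output file
```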
## Provided Quants
(sorted by size, not necessarily quality; IQ-quants are often preferable to similar-sized non-IQ quants)
| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ1_S.gguf) | i1-IQ1_S | 15.0 | for the desperate |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ1_M.gguf) | i1-IQ1_M | 16.0 | mostly desperate |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ2_XXS.gguf) | i1-IQ2_XXS | 18.7 | |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ2_XS.gguf) | i1-IQ2_XS | 20.8 | |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ2_S.gguf) | i1-IQ2_S | 21.8 | |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ2_M.gguf) | i1-IQ2_M | 23.7 | |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q2_K.gguf) | i1-Q2_K | 25.9 | IQ3_XXS probably better |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ3_XXS.gguf) | i1-IQ3_XXS | 27.4 | lower quality |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ3_XS.gguf) | i1-IQ3_XS | 28.6 | |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q3_K_XS.gguf) | i1-Q3_K_XS | 28.7 | |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ3_S.gguf) | i1-IQ3_S | 30.3 | beats Q3_K* |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q3_K_S.gguf) | i1-Q3_K_S | 30.3 | IQ3_XS probably better |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ3_M.gguf) | i1-IQ3_M | 31.4 | |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q3_K_M.gguf) | i1-Q3_K_M | 33.7 | IQ3_S probably better |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q3_K_L.gguf) | i1-Q3_K_L | 36.6 | IQ3_M probably better |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ4_XS.gguf) | i1-IQ4_XS | 37.2 | |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ4_NL.gguf) | i1-IQ4_NL | 39.4 | prefer IQ4_XS |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q4_0.gguf) | i1-Q4_0 | 39.4 | fast, low quality |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q4_K_S.gguf) | i1-Q4_K_S | 39.7 | optimal size/speed/quality |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q4_K_M.gguf) | i1-Q4_K_M | 41.8 | fast, recommended |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q5_K_S.gguf) | i1-Q5_K_S | 47.9 | |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q5_K_M.gguf) | i1-Q5_K_M | 49.2 | |
| [PART 1](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q6_K.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q6_K.gguf.part2of2) | i1-Q6_K | 57.0 | practically like static Q6_K |
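To fetch one of the single-file quants above programmatically instead of through the browser, the standard `huggingface_hub` API works; a sketch using the recommended Q4_K_M file (any filename from the table can be substituted):

```python
# Download one quant from this repo via the Hugging Face Hub API.
# Requires: pip install huggingface_hub
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="mradermacher/GOAT-70B-Storytelling-i1-GGUF",
    filename="GOAT-70B-Storytelling.i1-Q4_K_M.gguf",
)
print(path)  # local cache path of the downloaded GGUF file
```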
Here is a handy graph by ikawrakow comparing some lower-quality quant
types (lower is better):
![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)
And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9
## FAQ / Model Request
See https://huggingface.co/mradermacher/model_requests for answers to
common questions, or if you want some other model quantized.
## Thanks
I thank my company, [nethype GmbH](https://www.nethype.de/), for letting
me use its servers and providing upgrades to my workstation to enable
this work in my free time.
<!-- end -->