Lewdiculous committed
Commit dbe1e62 • 1 Parent(s): 00935fb
Update README.md
README.md
CHANGED
@@ -17,7 +17,7 @@ My GGUF-IQ-Imatrix quants for [**Nitral-AI/Hathor-L3-8B-v.01**](https://huggingf
 > [!NOTE]
 > **General usage:** <br>
-> Use the latest version of **
+> Use the [**latest version of KoboldCpp**](https://github.com/LostRuins/koboldcpp/releases/latest). <br>
 > Remember that you can also use `--flashattention` on KoboldCpp now even with non-RTX cards for reduced VRAM usage. <br>
 > For **8GB VRAM** GPUs, I recommend the **Q4_K_M-imat** quant for up to 12288 context sizes. <br>
 > For **12GB VRAM** GPUs, the **Q5_K_M-imat** quant will give you a great size/quality balance. <br>
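The note above pairs quants with VRAM budgets and mentions KoboldCpp's `--flashattention` flag. As a minimal sketch of the 8GB-VRAM recommendation, a launch command might look like the following (the GGUF filename is an assumption based on the repo's naming; `--model`, `--flashattention`, and `--contextsize` are standard KoboldCpp options):

```shell
# Hypothetical launch per the README's 8GB VRAM recommendation:
# Q4_K_M imatrix quant, flash attention enabled, 12288-token context.
# On Windows, substitute the koboldcpp.exe binary for the Python entry point.
python koboldcpp.py \
  --model Hathor-L3-8B-v.01-Q4_K_M-imat.gguf \
  --flashattention \
  --contextsize 12288
```

For a 12GB card, swapping in the Q5_K_M-imat file with the same flags would follow the note's size/quality suggestion.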