Lewdiculous committed
Commit dbe1e62 • 1 Parent(s): 00935fb
Update README.md
README.md
CHANGED
@@ -17,7 +17,7 @@ My GGUF-IQ-Imatrix quants for [**Nitral-AI/Hathor-L3-8B-v.01**](https://huggingf
 > [!NOTE]
 > **General usage:** <br>
-> Use the latest version of **
+> Use the [**latest version of KoboldCpp**](https://github.com/LostRuins/koboldcpp/releases/latest). <br>
 > Remember that you can also use `--flashattention` on KoboldCpp now even with non-RTX cards for reduced VRAM usage. <br>
 > For **8GB VRAM** GPUs, I recommend the **Q4_K_M-imat** quant for up to 12288 context sizes. <br>
 > For **12GB VRAM** GPUs, the **Q5_K_M-imat** quant will give you a great size/quality balance. <br>
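The note above pairs quants with VRAM budgets and mentions KoboldCpp's `--flashattention` flag. As a minimal sketch of the 8GB-VRAM recommendation, a launch command might look like the following (the GGUF filename is an assumption based on the repo's naming; `--model`, `--flashattention`, and `--contextsize` are standard KoboldCpp options):

```shell
# Hypothetical launch per the README's 8GB VRAM recommendation:
# Q4_K_M imatrix quant, flash attention enabled, 12288-token context.
# On Windows, substitute the koboldcpp.exe binary for the Python entry point.
python koboldcpp.py \
  --model Hathor-L3-8B-v.01-Q4_K_M-imat.gguf \
  --flashattention \
  --contextsize 12288
```

For a 12GB card, swapping in the Q5_K_M-imat file with the same flags would follow the note's size/quality suggestion.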