Where's the GGUF version gone?

by nulled - opened Oct 31, 2023

Discussion

nulled

Oct 31, 2023

I think there was a GGUF version, but I can't find it anymore. Is something wrong with that version?

TheBloke

Owner Oct 31, 2023

Yes, it was broken and unusable. I've not yet figured out how to make a working version. I'll need to raise it with the llama.cpp team, but I haven't had time yet.

nulled

Nov 1, 2023

Any luck yet?

Ransom

Nov 5, 2023

Someone from Reddit has posted a quant that works with llama.cpp here: https://huggingface.co/imi2/airoboros-180b-2.2.1-gguf
Just make sure you're running the latest version of llama.cpp and follow the instructions for merging the files.
Here's the command I use to run it:
./server --model models/airoboros-180b-2.2.1-Q5_K_M.gguf --n-gpu-layers 128 --ctx-size 4090 --port 5005 --host 0.0.0.0 --parallel 1 --cont-batching --threads 24
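
For anyone unsure about the merge step, here is a minimal sketch of joining split files in Python. It assumes the parts are plain byte-wise splits that only need concatenating in order; that is an assumption, not something confirmed in this thread, so the instructions in the linked repository take precedence. The function name `merge_parts` and any file names you pass in are hypothetical.

```python
import shutil
import sys
from pathlib import Path


def merge_parts(parts, output):
    """Concatenate the given part files byte-for-byte, in the order given.

    Assumes each part is a raw slice of the original file (hypothetical
    layout; check the repository's own merge instructions first).
    Returns the size of the merged file in bytes.
    """
    output = Path(output)
    with output.open("wb") as dst:
        for part in parts:
            with Path(part).open("rb") as src:
                # Stream in chunks so very large parts never sit fully in RAM.
                shutil.copyfileobj(src, dst)
    return output.stat().st_size


if __name__ == "__main__":
    # Usage: python merge.py <output.gguf> <part1> <part2> ...
    out_path, *part_paths = sys.argv[1:]
    size = merge_parts(part_paths, out_path)
    print(f"wrote {size} bytes to {out_path}")
```

The order of the parts matters, so pass them explicitly rather than relying on a directory listing. Once merged, point `--model` at the output file.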
