https://huggingface.co/hardlyworking/backmerge8b
For some reason I haven't been able to get anywhere near reasonable speeds on imatrix generation for several months now, and I see that imatrix documentation has been removed from the llama.cpp repo... weird...
It's queued! :D
You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#backmerge8b-GGUF for quants to appear.
> For some reason I haven't been able to get anywhere near reasonable speeds on imatrix generation for several months now
Make sure you compile llama.cpp with CUDA support and run with -ngl 0. Without the GPU being used, the imatrix computation performance will be unusable.
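For reference, a minimal sketch of that setup, assuming a recent llama.cpp checkout; the model and calibration file names are placeholders:

```sh
# Build llama.cpp with the CUDA backend enabled.
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j

# Run imatrix with -ngl 0: the weights stay in system RAM, but with a
# CUDA build the heavy batched matrix multiplications can still be
# offloaded to the GPU, which is what makes this usable.
./build/bin/llama-imatrix \
    -m backmerge8b.gguf \
    -f calibration_data.txt \
    -o backmerge8b.imatrix \
    -ngl 0
```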
> and I see that imatrix documentation has been removed from the llama.cpp repo... weird...
They just moved it from "examples" to "tools". You can find it here now: https://github.com/ggml-org/llama.cpp/blob/master/tools/imatrix/README.md