https://huggingface.co/jondurbin/bagel-dpo-20b-v04-llama
jondurbin/bagel-dpo-20b-v04-llama and jondurbin/bagel-20b-v04-llama
need imatrix quants. The existing quant from dranger003 ("dranger003/bagel-dpo-20b-v04-llama-iMat.GGUF") occasionally generates a wrong token, emitting an end-of-turn-style token at an inopportune time (reproducible in LM Studio versions 0.2.18 through 0.2.23).
Also, the quant for bagel-dpo-20b-v04 (the un-llamafied version) cannot be loaded correctly in LM Studio.
Not sure another quant will change anything about these issues, but sure, they are in the queue. If nothing goes wrong, they should be available within a few hours. Next time, it would help to provide URLs for the models.
Unfortunately, llama.cpp crashes when trying to generate the imatrix, and this looks like a bug in the model, so no imatrix quants:
GGML_ASSERT: llama.cpp/llama.cpp:4530: unicode_cpts_from_utf8(word).size() > 0
This might affect the static ones as well.
Yup, it affects the static quants as well. I could try the old convert.py script, but that is unlikely to yield results different from dranger003's, other than a different imatrix. The model probably has a tokenizer issue, which leaves a choice between the model not loading at all and the model having subtle bugs.
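For context, the failing assert checks that every tokenizer vocab word decodes to at least one Unicode codepoint; a vocab entry that is not valid UTF-8 can decode to zero codepoints and trip it. Below is a minimal illustrative sketch (not llama.cpp's actual code; the decoder is simplified and the vocab entries are made up) of that invariant:

```python
def unicode_cpts_from_utf8(data: bytes) -> list[int]:
    """Simplified stand-in for llama.cpp's unicode_cpts_from_utf8:
    decode bytes into Unicode codepoints, skipping invalid bytes."""
    cpts = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:  # 1-byte ASCII
            cpts.append(b)
            i += 1
        elif 0xC0 <= b < 0xE0 and i + 1 < len(data):  # 2-byte sequence
            cpts.append(((b & 0x1F) << 6) | (data[i + 1] & 0x3F))
            i += 2
        elif 0xE0 <= b < 0xF0 and i + 2 < len(data):  # 3-byte sequence
            cpts.append(((b & 0x0F) << 12)
                        | ((data[i + 1] & 0x3F) << 6)
                        | (data[i + 2] & 0x3F))
            i += 3
        elif 0xF0 <= b < 0xF8 and i + 3 < len(data):  # 4-byte sequence
            cpts.append(((b & 0x07) << 18)
                        | ((data[i + 1] & 0x3F) << 12)
                        | ((data[i + 2] & 0x3F) << 6)
                        | (data[i + 3] & 0x3F))
            i += 4
        else:
            i += 1  # invalid or truncated byte: skip it entirely
    return cpts

# Hypothetical vocab: the last entry is a lone UTF-8 lead byte with no
# continuation, so it decodes to zero codepoints and would trip the
# GGML_ASSERT(unicode_cpts_from_utf8(word).size() > 0) check.
vocab = [b"hello", b"\xe4\xb8\xad", b"\xc0"]
for word in vocab:
    if len(unicode_cpts_from_utf8(word)) == 0:
        print(f"bad vocab entry {word!r}: decodes to zero codepoints")
```

If the model's vocab really does contain such entries, re-quantizing cannot fix it; the tokenizer data itself would need to be repaired upstream.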