Using Oobabooga and a 4090 (Windows install) - which of the two available models do I use?
#1 by cleverest - opened
Oh whoops, there weren't meant to be two files there. I uploaded the 4bit-128g file by mistake, then uploaded the 4bit no-groupsize file but forgot to delete the former.
If you're using ExLlama, you can use the 4bit-128g file, which has higher accuracy. It'll be too slow to use with AutoGPTQ or GPTQ-for-LLaMa (which is why I don't normally provide it), but ExLlama can run it at the same speed as the no-groupsize file.
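As a rough sketch of that choice, here is a small hypothetical Python helper. The function name and the variant labels are assumptions made for illustration: the labels are shorthand for the two files described above, not the actual file names in the repo, and the loader names are matched as plain strings.

```python
# Hypothetical helper: the returned labels are shorthand for the two
# uploaded files, not real file names from the repo.
def pick_gptq_variant(loader: str) -> str:
    """Suggest which of the two GPTQ files suits a given loader."""
    normalized = loader.strip().lower()
    if normalized == "exllama":
        # ExLlama runs the group-size-128 file without a speed penalty,
        # so the higher-accuracy file is the better pick.
        return "4bit-128g"
    if normalized in {"autogptq", "gptq-for-llama"}:
        # The 128g file is too slow with these loaders.
        return "4bit-no-groupsize"
    raise ValueError(f"unknown loader: {loader!r}")


if __name__ == "__main__":
    for name in ("ExLlama", "AutoGPTQ", "GPTQ-for-LLaMa"):
        print(name, "->", pick_gptq_variant(name))
```

In practice you would just pick the file by hand in the UI; the function only spells out the rule.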
In future I plan to provide multiple GPTQ versions for each repo, to give people a choice. Though they will be in separate branches, not in the same folder.