The model is broken
Okay, I fixed it.
Care to share what was wrong?
They probably hadn't followed the details in the README regarding which GPTQ-for-LLaMa code to use. If you use the safetensors file, you must use the latest GPTQ-for-LLaMa code for inference. If you don't want to update GPTQ-for-LLaMa, you can use the .pt file instead.
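In case it helps, here is a rough sketch of updating the GPTQ-for-LLaMa code that text-generation-webui picks up. It assumes the usual repositories/GPTQ-for-LLaMa layout and the oobabooga cuda branch, so adjust the paths to your own install:

# run from the text-generation-webui directory (assumed layout)
cd repositories/GPTQ-for-LLaMa
git pull
python setup_cuda.py install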
Yes, it turned out that you need to completely remove GPTQ-for-LLaMa and compile it again.
I'm using oobabooga via its Dockerfile, and I had to use the .pt file to make it work.
For me it is strange, because vicuna-13b-GPTQ-4bit-128g works well and is a safetensors file as well, but the 1.1 model does not.
# clone the CUDA branch and build the quantization kernel
git clone https://github.com/oobabooga/GPTQ-for-LLaMa.git -b cuda
cd GPTQ-for-LLaMa
python setup_cuda.py install
And I just tried deleting the folder and running the commands above, but it still does not work. Maybe I am using the wrong readme?
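If deleting the folder is not enough, one possible culprit is that setup_cuda.py install leaves the previously built kernel installed in your Python environment, so the old version keeps being picked up. A rough sketch of a fully clean rebuild, assuming the compiled extension was installed under its usual quant_cuda name:

pip uninstall -y quant-cuda          # remove the previously installed kernel (name assumed)
rm -rf GPTQ-for-LLaMa                # drop the old checkout
git clone https://github.com/oobabooga/GPTQ-for-LLaMa.git -b cuda
cd GPTQ-for-LLaMa
python setup_cuda.py install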
OK. Seems there is a guide in this model card's readme...