Quantized with the older GPTQ code, before the update that broke compatibility with existing models.

Use with:

https://github.com/Ph0rk0z/text-generation-webui-testing

https://github.com/Ph0rk0z/GPTQ-Merged

https://github.com/Curlypla/peft-GPTQ

Clone GPTQ-Merged and peft-GPTQ into text-generation-webui-testing/repositories.
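A minimal sequence of commands, assuming a fresh checkout (adjust paths to your setup):

git clone https://github.com/Ph0rk0z/text-generation-webui-testing

cd text-generation-webui-testing/repositories

git clone https://github.com/Ph0rk0z/GPTQ-Merged

git clone https://github.com/Curlypla/peft-GPTQ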

Run python cuda_setup.py install inside GPTQ-Merged to compile the NVIDIA kernel.

python server.py --cai-chat --gptq-bits 4 --model opt-6b --autograd

Don't forget to get configs from: https://huggingface.co/facebook/opt-6.7b/tree/main

You only need the JSON files; don't forget merges.txt.
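A sketch of fetching them into the model folder (the folder name models/opt-6b and the exact file list are assumptions; check the repo's file listing for what is actually there):

cd models/opt-6b

wget https://huggingface.co/facebook/opt-6.7b/resolve/main/config.json

wget https://huggingface.co/facebook/opt-6.7b/resolve/main/tokenizer_config.json

wget https://huggingface.co/facebook/opt-6.7b/resolve/main/special_tokens_map.json

wget https://huggingface.co/facebook/opt-6.7b/resolve/main/vocab.json

wget https://huggingface.co/facebook/opt-6.7b/resolve/main/merges.txt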
