---
license: unknown
---
ehartford/WizardLM-7B-Uncensored quantized to 8-bit GPTQ with act order and true sequential, no group size.
*For most uses, this probably isn't what you want.*
For 4-bit with no act order, or for compatibility with the old-cuda kernel (the text-generation-webui default), see TheBloke/WizardLM-7B-uncensored-GPTQ.
Quantized using AutoGPTQ with the following config:
```python
config: dict = dict(
    quantize_config=dict(
        bits=8,
        desc_act=True,
        true_sequential=True,
        model_file_base_name='WizardLM-7B-Uncensored',
    ),
    use_safetensors=True,
)
```
See `quantize.py` for the full script.
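Since `quantize.py` is not reproduced here, the following is only a minimal sketch of how the config above plugs into AutoGPTQ's standard quantization flow; the calibration text and output directory are placeholders, not taken from the actual script:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained = "ehartford/WizardLM-7B-Uncensored"

quantize_config = BaseQuantizeConfig(
    bits=8,
    desc_act=True,            # act order
    true_sequential=True,
    model_file_base_name="WizardLM-7B-Uncensored",
    # group_size left at its default of -1, i.e. no group size
)

tokenizer = AutoTokenizer.from_pretrained(pretrained, use_fast=True)
model = AutoGPTQForCausalLM.from_pretrained(pretrained, quantize_config)

# Calibration data: a list of tokenized examples. This sample text is a
# placeholder; the real script may use a different calibration set.
examples = [
    tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
]

model.quantize(examples)
model.save_quantized("WizardLM-7B-Uncensored-8bit", use_safetensors=True)  # hypothetical output dir
```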
Tested for compatibility with:
- WSL with the GPTQ-for-Llama `triton` branch
- Windows with AutoGPTQ on CUDA (triton deselected)
The AutoGPTQ loader should read its configuration from `quantize_config.json`.
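As an illustration, loading with AutoGPTQ from Python might look like the sketch below; the local path is hypothetical, and `quantize_config.json` shipped with the model supplies the bits, act order, and group size settings:

```python
from auto_gptq import AutoGPTQForCausalLM

# use_safetensors matches the saved format; use_triton=False mirrors the
# "triton deselected" Windows setup noted above.
model = AutoGPTQForCausalLM.from_quantized(
    "WizardLM-7B-Uncensored-8bit",  # hypothetical local path or repo id
    device="cuda:0",
    use_safetensors=True,
    use_triton=False,
)
```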
For GPTQ-for-Llama, use the following configuration when loading:

- wbits: 8
- groupsize: None
- model_type: llama
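If loading through text-generation-webui with the GPTQ-for-Llama backend, these settings correspond, assuming a webui version of that era where these flags apply, to something like:

```
# model directory name is hypothetical; --groupsize -1 means no group size
python server.py --model WizardLM-7B-Uncensored-8bit --wbits 8 --groupsize -1 --model_type llama
```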