---
license: unknown
---
ehartford/WizardLM-7B-Uncensored quantized to 8-bit GPTQ with act order and true sequential, no group size.
*For most uses, this probably isn't what you want.*
For 4-bit with no act order, or for compatibility with the old-cuda kernel (the text-generation-webui default), see TheBloke/WizardLM-7B-uncensored-GPTQ.
Quantized using AutoGPTQ with the following config:
```python
config: dict = dict(
    quantize_config=dict(
        bits=8,
        desc_act=True,
        true_sequential=True,
        model_file_base_name='WizardLM-7B-Uncensored',
    ),
    use_safetensors=True,
)
```
See `quantize.py` for the full script.
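Since `quantize.py` is not reproduced here, the following is only a minimal sketch of how the config above plugs into AutoGPTQ's standard quantization flow; the calibration text and output directory are placeholders, not taken from the actual script:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained = "ehartford/WizardLM-7B-Uncensored"

quantize_config = BaseQuantizeConfig(
    bits=8,
    desc_act=True,            # act order
    true_sequential=True,
    model_file_base_name="WizardLM-7B-Uncensored",
    # group_size left at its default of -1, i.e. no group size
)

tokenizer = AutoTokenizer.from_pretrained(pretrained, use_fast=True)
model = AutoGPTQForCausalLM.from_pretrained(pretrained, quantize_config)

# Calibration data: a list of tokenized examples. This sample text is a
# placeholder; the real script may use a different calibration set.
examples = [
    tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
]

model.quantize(examples)
model.save_quantized("WizardLM-7B-Uncensored-8bit", use_safetensors=True)  # hypothetical output dir
```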
Tested for compatibility with:
- WSL with the GPTQ-for-Llama `triton` branch
- Windows with AutoGPTQ on CUDA (triton deselected)
The AutoGPTQ loader should read its configuration from `quantize_config.json`.
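As an illustration, loading with AutoGPTQ from Python might look like the sketch below; the local path is hypothetical, and `quantize_config.json` shipped with the model supplies the bits, act order, and group size settings:

```python
from auto_gptq import AutoGPTQForCausalLM

# use_safetensors matches the saved format; use_triton=False mirrors the
# "triton deselected" Windows setup noted above.
model = AutoGPTQForCausalLM.from_quantized(
    "WizardLM-7B-Uncensored-8bit",  # hypothetical local path or repo id
    device="cuda:0",
    use_safetensors=True,
    use_triton=False,
)
```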
For GPTQ-for-Llama, use the following configuration when loading:

- wbits: 8
- groupsize: None
- model_type: llama
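If loading through text-generation-webui with the GPTQ-for-Llama backend, these settings correspond, assuming a webui version of that era where these flags apply, to something like:

```
# model directory name is hypothetical; --groupsize -1 means no group size
python server.py --model WizardLM-7B-Uncensored-8bit --wbits 8 --groupsize -1 --model_type llama
```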