This is https://huggingface.co/ehartford/Wizard-Vicuna-30B-Uncensored merged with https://huggingface.co/kaiokendev/superhot-30b-8k-no-rlhf-test, then quantized to 4-bit with AutoGPTQ.

There are two quantized versions. One is a plain 4-bit version with act-order only and no groupsize. The other is an experimental version using groupsize 128 and act-order, with kaiokendev's ScaledLLamaAttention monkey patch applied during quantization. The idea is to help the calibration account for the new scale. It seems to have worked: perplexity improves by around 0.04 vs the unpatched quant. Maybe not worth the trouble, but it's better, so I'll put it up anyway.
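For anyone wondering what "the new scale" refers to, here is a rough sketch of the general idea behind scaled rotary positions (position interpolation). It is not kaiokendev's actual patch. The helper name `rope_angles`, the 2048 and 8192 context lengths, the resulting 0.25 scale, and the head dimension of 128 are all assumptions chosen for illustration.

```python
import math

def rope_angles(position, head_dim, base=10000.0, scale=1.0):
    """Return the rotary angles for one token position (hypothetical helper).

    scale=1.0 is plain RoPE. A scale below 1 compresses positions so that a
    longer sequence maps into the position range seen during pretraining.
    """
    scaled_position = position * scale
    return [
        scaled_position / (base ** (2 * i / head_dim))
        for i in range(head_dim // 2)
    ]

# Assumed values for illustration only.
ORIGINAL_CTX = 2048   # assumed pretraining context length
TARGET_CTX = 8192     # assumed extended context length
SCALE = ORIGINAL_CTX / TARGET_CTX

# With the scale applied, position 8188 in the long context produces the same
# angles as position 2047 in the original, unscaled context.
patched = rope_angles(8188, head_dim=128, scale=SCALE)
unpatched = rope_angles(2047, head_dim=128)
same_angles = all(math.isclose(a, b) for a, b in zip(patched, unpatched))
```

Since the scaled model sees different rotary angles for the same token index, running the quantization calibration with the same scaling in place means the calibration activations should match what the model produces at inference time.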
