NOTE: A safetensors 4-bit quant will be uploaded within the day. Cheers.

This is a GPTQ 4-bit quant of ChanSung's Elina 33b. It is a LLaMA-based model, with the LoRA merged using the latest transformers conversion. Quantized with GPTQ: --wbits 4 --act-order --true-sequential --save_safetensors, calibrated on c4.

A groupsize of 128 was deliberately not used, so that anyone running this on a consumer GPU with 24GB of VRAM can use the full context (2048) without risk of OOM.
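
For a rough sense of why skipping groupsize helps on a 24GB card, here is a back-of-the-envelope sketch. Everything in it is an assumption rather than a measurement of this model: the LLaMA 33B shape (about 32.5B parameters, 60 layers, hidden size 6656), an fp16 scale plus a 4-bit zero point stored per group, and an fp16 KV cache. It ignores activations, the unquantized embedding layers, the CUDA context and allocator fragmentation, so real usage will be higher.

```python
GIB = 1024 ** 3

# Assumed LLaMA 33B shape; these figures are not stated in this card.
N_PARAMS = 32.5e9
N_LAYERS = 60
HIDDEN = 6656


def weight_bytes(n_params, wbits=4, groupsize=None, scale_bits=16):
    """Approximate size of the quantized weights in bytes.

    With a groupsize, each group of weights is assumed to carry one
    scale (scale_bits) and one zero point (wbits) of extra storage.
    Without one, the per-row overhead is small enough to ignore here.
    """
    bits_per_weight = float(wbits)
    if groupsize is not None:
        bits_per_weight += (scale_bits + wbits) / groupsize
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers, hidden, context, bytes_per_value=2):
    """Keys and values for every layer and position, stored in fp16."""
    return 2 * n_layers * hidden * context * bytes_per_value


def estimate_gib(groupsize=None, context=2048):
    """Weights plus KV cache, in GiB, under the assumptions above."""
    total = weight_bytes(N_PARAMS, groupsize=groupsize)
    total += kv_cache_bytes(N_LAYERS, HIDDEN, context)
    return total / GIB


if __name__ == "__main__":
    for gs in (None, 128):
        print(f"groupsize={gs}: ~{estimate_gib(gs):.1f} GiB")
```

Under these assumptions the weights and KV cache alone come to roughly 18 to 19 GiB at full context, and groupsize 128 adds a bit more than half a GiB on top. That is a small share of the total, but on a 24GB card it cuts into the headroom left for activations and runtime overhead.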

Original LoRA: https://huggingface.co/LLMs/Alpaca-LoRA-30B-elina

Repo: https://huggingface.co/LLMs

Likely Author: https://huggingface.co/chansung
