NOTE: A safetensors 4-bit quant will be uploaded within the day. Cheers.
This is a GPTQ 4-bit quant of ChanSung's Elina 33b. This is a LLaMA-based model; the LoRA was merged using the latest transformers conversion. Quantized with GPTQ on the c4 calibration dataset, using --wbits 4 --act-order --true-sequential --save_safetensors.
A groupsize of 128 was not used, so that those running this on a consumer GPU with 24GB of VRAM can run it at full context (2048) without running out of memory (OOM).
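As a rough illustration of the VRAM reasoning above, here is a back-of-the-envelope sketch. Everything in it is an assumption rather than a measurement of this model: roughly 32.5 billion parameters, 60 layers and a hidden size of 6656 (the commonly cited LLaMA 33B shape), an fp16 key/value cache, and one 16-bit scale plus one 4-bit zero point stored per group when a groupsize is used. The function names are hypothetical helpers, and activations and framework overhead are ignored.

```python
# Rough VRAM estimate for a 4-bit GPTQ quant of a LLaMA 33B class model.
# All constants below are assumptions for illustration, not measured values.

def quantized_weight_gb(n_params, wbits=4, groupsize=None,
                        scale_bits=16, zero_bits=4):
    """Approximate size of the quantized weights in GB (1e9 bytes)."""
    bits_per_weight = float(wbits)
    if groupsize is not None:
        # Assumed overhead: one scale and one zero point per group of weights.
        bits_per_weight += (scale_bits + zero_bits) / groupsize
    return n_params * bits_per_weight / 8 / 1e9


def kv_cache_gb(n_layers, hidden_size, context_len, bytes_per_value=2):
    """Approximate key/value cache size in GB for one sequence."""
    # Factor 2 accounts for storing both keys and values in every layer.
    return 2 * n_layers * hidden_size * context_len * bytes_per_value / 1e9


N_PARAMS = 32.5e9   # assumed parameter count
N_LAYERS = 60       # assumed layer count
HIDDEN = 6656       # assumed hidden size
CONTEXT = 2048      # full context, as stated above

no_group = quantized_weight_gb(N_PARAMS)
group_128 = quantized_weight_gb(N_PARAMS, groupsize=128)
cache = kv_cache_gb(N_LAYERS, HIDDEN, CONTEXT)

print(f"weights, no groupsize:  {no_group:.2f} GB")
print(f"weights, groupsize 128: {group_128:.2f} GB")
print(f"KV cache at {CONTEXT} tokens: {cache:.2f} GB")
print(f"extra cost of groupsize 128: {group_128 - no_group:.2f} GB")
```

Under these assumptions the per-group metadata adds a few hundred MB on top of the weights, which eats into the headroom left for the cache and activations on a 24GB card. Actual usage depends on the loader and kernels, so treat the numbers as an estimate only.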
Original LoRA: https://huggingface.co/LLMs/Alpaca-LoRA-30B-elina
Repo: https://huggingface.co/LLMs
Likely Author: https://huggingface.co/chansung