GPTQ version? Thanks!
#3 opened by YaTharThShaRma999
Would be nice to have a GPTQ version so we could run it on limited VRAM. Thanks for making these models!
@sinanisler
I think I would recommend the GGUF version instead for now. Since ExLlama does not support multimodal GPTQ, llama.cpp with GGUF is much, much faster than GPTQ.
Here's a LLaVA 7B GGUF:
https://huggingface.co/jartine/llava-v1.5-7B-GGUF
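If it helps, here is a rough sketch of loading a LLaVA GGUF through llama-cpp-python. The file names below are placeholders, not the exact files in that repo, so swap in whatever model and mmproj files you actually download.

```python
# Minimal sketch of running a LLaVA 1.5 GGUF with llama-cpp-python.
# File names are placeholders; use the model + mmproj files you downloaded.
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

# The mmproj file holds the CLIP vision projector that GGUF LLaVA needs.
chat_handler = Llava15ChatHandler(clip_model_path="llava-v1.5-7b-mmproj-f16.gguf")

llm = Llama(
    model_path="llava-v1.5-7b-Q4_K_M.gguf",  # quantized weights keep VRAM use low
    chat_handler=chat_handler,
    n_ctx=2048,        # image tokens consume context, so don't set this too small
    n_gpu_layers=-1,   # offload as many layers as fit onto the GPU
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an assistant that describes images."},
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
                {"type": "text", "text": "What is in this picture?"},
            ],
        },
    ]
)
print(response["choices"][0]["message"]["content"])
```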
@YaTharThShaRma999
Thank you, I didn't see this one.
I will try it :)