GPTQ version? Thanks!
#3 opened by YaTharThShaRma999
Would be nice to have a GPTQ version so we could run it on limited VRAM. Thanks for making these models!
@sinanisler
I think I would recommend the GGUF version instead for now. Since ExLlama does not support multimodal GPTQ, llama.cpp with GGUF is much, much faster than GPTQ.
Here's a LLaVA 7B GGUF:
https://huggingface.co/jartine/llava-v1.5-7B-GGUF
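If it helps, here is a rough sketch of loading a LLaVA GGUF through llama-cpp-python. The file names below are placeholders, not the exact files in that repo, so swap in whatever model and mmproj files you actually download.

```python
# Minimal sketch of running a LLaVA 1.5 GGUF with llama-cpp-python.
# File names are placeholders; use the model + mmproj files you downloaded.
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

# The mmproj file holds the CLIP vision projector that GGUF LLaVA needs.
chat_handler = Llava15ChatHandler(clip_model_path="llava-v1.5-7b-mmproj-f16.gguf")

llm = Llama(
    model_path="llava-v1.5-7b-Q4_K_M.gguf",  # quantized weights keep VRAM use low
    chat_handler=chat_handler,
    n_ctx=2048,        # image tokens consume context, so don't set this too small
    n_gpu_layers=-1,   # offload as many layers as fit onto the GPU
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an assistant that describes images."},
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
                {"type": "text", "text": "What is in this picture?"},
            ],
        },
    ]
)
print(response["choices"][0]["message"]["content"])
```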
@YaTharThShaRma999
Thank you, I didn't see this one.
I will try it :)