Q4/Q5 models?

#2
by enterausername - opened

Hello! I was really interested in testing this out, but as a Q8 model it exceeds my hardware capabilities. I'm looking into cloud options, but even then a Q4 or Q5 version would seem to be the best fit.

Would you consider releasing a lower-bit quantized version? I couldn't find the model weights anywhere to quantize it myself. I'd really appreciate it!
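
For a rough sense of how much a lower quant would save, here is a back-of-the-envelope sketch. It assumes the llama.cpp legacy block formats (Q8_0, Q5_0, Q4_0), which store 32 weights per block plus a 16-bit scale, and it uses a hypothetical parameter count since the model size isn't stated here. Real GGUF files often mix tensor types, so treat the numbers as estimates only.

```python
# Approximate bits per weight for llama.cpp legacy block formats:
# 32 weights per block plus one fp16 scale (Q5_0 also stores 4 bytes of high bits).
BITS_PER_WEIGHT = {"Q8_0": 8.5, "Q5_0": 5.5, "Q4_0": 4.5}


def estimate_gib(n_params: float, quant: str) -> float:
    """Rough weights-only size in GiB; ignores KV cache and runtime overhead."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 2**30


# Hypothetical parameter count for illustration, NOT taken from this repo.
N_PARAMS = 12e9

for name in BITS_PER_WEIGHT:
    print(f"{name}: ~{estimate_gib(N_PARAMS, name):.1f} GiB")
```

By this estimate, Q4_0 weights take a bit more than half the space of Q8_0, before counting context memory.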

Hello!

For some reason, I've had difficulties getting Gemma 3 to quantize to Q4_0, but I've seen other people do it, so it's definitely possible.

Let me look into this after I get out of work and I'll get back to you! I'm sure I can figure it out, and if so, I'll add it to this repo and update you here.

  • Reed
