No IQ2_XXS on purpose?
Hello, sorry to bother you. I really appreciate your work!
Since I was pleasantly surprised by how good the QwQ quant was, I wonder
whether an IQ2_XXS version of Gemma is, or would be, less successful?
gr
Yeah, it was a conscious decision, have to put the cutoff somewhere 😅
What kind of card are you attempting to fit it on where 8.44GB is too big?
How much smaller could the IQ2_XXS be? If a 4060 with 8GB could run a Gemma 27B quant, that might be interesting to someone, but my guess is that IQ2_XXS would come in at ~8.1 GB or something anyway.
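For a rough sense of the numbers, here is a back-of-the-envelope sketch in Python. It assumes "27B" means exactly 27 billion weights and that every weight is stored at one nominal bit width; the bits-per-weight values and the helper name `estimate_size_gb` are assumptions for illustration, not figures taken from this repo.

```python
def estimate_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough size in GB (10^9 bytes) if every weight used the same bit width."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: "27B" taken as exactly 27 billion weights.
N_PARAMS = 27e9

# Assumed nominal bits per weight for each quant type.
# Real files mix tensor types and carry metadata, so they come out larger.
NOMINAL_BPW = {"IQ2_XXS": 2.0625, "IQ2_XS": 2.3125}

for name, bpw in NOMINAL_BPW.items():
    print(f"{name}: ~{estimate_size_gb(N_PARAMS, bpw):.2f} GB")
```

Treat the results as a lower bound only: actual files usually keep some tensors at higher precision, so the real saving from going one step smaller is likely to be well under 1 GB.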
I understand, thanks for replying :)
I'd rather not tell, but since you asked 😅: I am currently running QwQ with full context, partly on the CPU and partly on an NVIDIA 1060 with 8 GB of memory.
Most of the time I even reach for q4_m.
Complex coding tasks can take a while, but it mainly fixes my Python/JavaScript syntax and indentation errors.
P.S. This is in no way, shape, or form a request (well, it was, but I was mostly just interested in the why).
As nkelly said, the size reduction will be small anyway.
thnx