Why are the weights larger than the Instruct model's?

#1
by sigjhl - opened

This repo sums to about 19 GB, unlike all the other Qwen3-30B 4-bit quants. Is there a reason?

The DWQ was made with group size 32 for both the high and low bits. That usually defaults to 64, which gives a more compact file but also costs some precision and "depth" of thought (with group size 32 the chain of thought doesn't break as easily). The resulting DWQ had a very low conversion loss, and should perform better than a straight Q6.
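A rough back-of-the-envelope sketch of why group size affects file size: in group-wise quantization each group of weights carries its own metadata (assuming, as in MLX-style quants, an fp16 scale plus an fp16 bias, i.e. about 32 bits of overhead per group), so halving the group size doubles that overhead. The 30e9 parameter count and the overhead figure here are approximations, not values from this repo.

```python
# Estimate effective bits per weight and file size for a 4-bit quant
# at different group sizes. Assumes ~32 bits of per-group metadata
# (fp16 scale + fp16 bias), as in MLX-style quantization.

def bits_per_weight(bits: int, group_size: int, overhead_bits: int = 32) -> float:
    """Effective bits per weight, including per-group metadata."""
    return bits + overhead_bits / group_size

def model_gb(n_params: float, bpw: float) -> float:
    """Approximate weight file size in GB (decimal)."""
    return n_params * bpw / 8 / 1e9

n = 30e9  # ~30B parameters, approximate
for g in (64, 32):
    bpw = bits_per_weight(4, g)
    print(f"group size {g}: {bpw:.2f} bits/weight ~= {model_gb(n, bpw):.1f} GB")
```

At group size 64 this gives about 4.5 bits/weight (~17 GB); at group size 32 it gives 5.0 bits/weight (~19 GB), which is consistent with the repo size asked about above.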

Oh, I didn't know group size affected model size. Thank you for the details!
