Why are the weights larger than the Instruct model's?

#1
by sigjhl - opened

This repo sums to about 19 GB, unlike all the other Qwen3-30B 4-bit quants. Is there a reason?

The DWQ was made with group size 32 for both the high and low bits. That usually defaults to 64, which gives a more compact file but also costs some precision and "depth" of thought (with group size 32 the chain of thought doesn't break as easily). The resulting DWQ had a very low conversion loss, and should perform better than a straight Q6.
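A rough back-of-the-envelope sketch of why group size affects file size: in group-wise quantization each group of weights carries its own metadata (assuming, as in MLX-style quants, an fp16 scale plus an fp16 bias, i.e. about 32 bits of overhead per group), so halving the group size doubles that overhead. The 30e9 parameter count and the overhead figure here are approximations, not values from this repo.

```python
# Estimate effective bits per weight and file size for a 4-bit quant
# at different group sizes. Assumes ~32 bits of per-group metadata
# (fp16 scale + fp16 bias), as in MLX-style quantization.

def bits_per_weight(bits: int, group_size: int, overhead_bits: int = 32) -> float:
    """Effective bits per weight, including per-group metadata."""
    return bits + overhead_bits / group_size

def model_gb(n_params: float, bpw: float) -> float:
    """Approximate weight file size in GB (decimal)."""
    return n_params * bpw / 8 / 1e9

n = 30e9  # ~30B parameters, approximate
for g in (64, 32):
    bpw = bits_per_weight(4, g)
    print(f"group size {g}: {bpw:.2f} bits/weight ~= {model_gb(n, bpw):.1f} GB")
```

At group size 64 this gives about 4.5 bits/weight (~17 GB); at group size 32 it gives 5.0 bits/weight (~19 GB), which is consistent with the repo size asked about above.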

Oh, I didn't know group size affected model size. Thank you for the details!
