Why are XL quants smaller than M quants?

#11
by ChuckMcSneed - opened

Q3_K_M: 112 GB
Q3_K_XL: 104 GB

Q4_K_M: 142 GB
Q4_K_XL: 134 GB

Doesn't look right.

Unsloth AI org

This is normal and happens sometimes: we reduce some non-essential layers to lower bit-widths and raise others to higher ones, and occasionally the XL variant ends up smaller overall.

But you can still use the Q4_K_M one, which uses our calibration dataset.
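The size inversion above can be seen with simple arithmetic. Here is a toy sketch (the layer groups, parameter counts, and bit-widths are all hypothetical, not Unsloth's actual recipe): if the bulk of the weights drop to fewer bits while only a few sensitive groups are raised, the "XL" mix comes out smaller than the "M" mix.

```python
# Hypothetical parameter counts per layer group (in parameters),
# purely illustrative numbers chosen for the example.
layers = {"attention": 20e9, "ffn": 150e9, "embeddings": 10e9}

def size_gb(bits_per_group):
    """Total file size in GB given a bits-per-weight assignment per group."""
    total_bits = sum(layers[name] * bits for name, bits in bits_per_group.items())
    return total_bits / 8 / 1e9  # bits -> bytes -> GB

# "M"-style: a fairly uniform bit mix across all groups.
m_scheme = {"attention": 4.5, "ffn": 4.5, "embeddings": 4.5}

# "XL"-style: raise a few sensitive groups, lower the large bulk group.
xl_scheme = {"attention": 6.0, "ffn": 4.0, "embeddings": 6.0}

print(f"M-style:  {size_gb(m_scheme):.1f} GB")   # 101.2 GB
print(f"XL-style: {size_gb(xl_scheme):.1f} GB")  # 97.5 GB
```

Because the "ffn" group dominates the parameter count, lowering its bit-width outweighs the increase on the smaller groups, so the XL-style file is smaller despite using more bits where it matters.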

Which one is better quality?

I'd say they perform about equally. The Q4_K_XL one might be a tad faster.

How do they compare to standard K_L quants?
