Why are XL quants smaller than M quants?

#11
by ChuckMcSneed - opened

Q3_K_M: 112 GB
Q3_K_XL: 104 GB

Q4_K_M: 142 GB
Q4_K_XL: 134 GB

Doesn't look right.

Unsloth AI org

This is normal and happens sometimes: we reduce some non-essential layers to lower bit-widths and raise others to higher ones, and occasionally the XL variant ends up smaller overall.

But you can still use the Q4_K_M one, which uses our calibration dataset.
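The size inversion above can be seen with simple arithmetic. Here is a toy sketch (the layer groups, parameter counts, and bit-widths are all hypothetical, not Unsloth's actual recipe): if the bulk of the weights drop to fewer bits while only a few sensitive groups are raised, the "XL" mix comes out smaller than the "M" mix.

```python
# Hypothetical parameter counts per layer group (in parameters),
# purely illustrative numbers chosen for the example.
layers = {"attention": 20e9, "ffn": 150e9, "embeddings": 10e9}

def size_gb(bits_per_group):
    """Total file size in GB given a bits-per-weight assignment per group."""
    total_bits = sum(layers[name] * bits for name, bits in bits_per_group.items())
    return total_bits / 8 / 1e9  # bits -> bytes -> GB

# "M"-style: a fairly uniform bit mix across all groups.
m_scheme = {"attention": 4.5, "ffn": 4.5, "embeddings": 4.5}

# "XL"-style: raise a few sensitive groups, lower the large bulk group.
xl_scheme = {"attention": 6.0, "ffn": 4.0, "embeddings": 6.0}

print(f"M-style:  {size_gb(m_scheme):.1f} GB")   # 101.2 GB
print(f"XL-style: {size_gb(xl_scheme):.1f} GB")  # 97.5 GB
```

Because the "ffn" group dominates the parameter count, lowering its bit-width outweighs the increase on the smaller groups, so the XL-style file is smaller despite using more bits where it matters.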

Which one is better quality?

I'd say they perform about equally. The Q4_K_XL one might be a tad faster.

How do they compare to standard K_L quants?
