Question about 1-bit quant (#2)
opened by ThomasBaruzier
Hello,
You are claiming your 1-bit quant is "custom".
Could you please elaborate on how it was made, and whether it is higher quality than a traditional IQ1_S or IQ1_M quant?
Thanks.
Only ~92% of the weights are 1-bit, so I had to rewrite llama.cpp to do that custom quant. I also have not uploaded them yet.
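For a rough sense of what that mix means for overall size, here is a back-of-the-envelope sketch. The bits-per-weight figures (≈1.56 bpw for the 1-bit portion, 5.5 bpw for the rest) are illustrative assumptions, not the actual types used in this quant:

```python
# Back-of-the-envelope estimate of the effective bits-per-weight (bpw)
# for a mixed quant where most weights are 1-bit and the rest stay at
# a higher-precision type. The bpw values below are illustrative
# assumptions, not the actual recipe.

def effective_bpw(frac_low: float, bpw_low: float, bpw_high: float) -> float:
    """Weighted average bpw for a two-type mix."""
    return frac_low * bpw_low + (1.0 - frac_low) * bpw_high

def approx_size_gb(n_params_billion: float, bpw: float) -> float:
    """Approximate on-disk size in GB for the weight tensors alone."""
    return n_params_billion * 1e9 * bpw / 8 / 1e9

if __name__ == "__main__":
    # ~92% of weights at ~1.56 bpw (IQ1_S-like), ~8% at ~5.5 bpw (assumed)
    bpw = effective_bpw(0.92, 1.56, 5.5)
    print(f"effective bpw: {bpw:.2f}")  # ~1.88 with these assumptions
    print(f"approx size for 70B params: {approx_size_gb(70, bpw):.1f} GB")
```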
Thank you for the answer.
If you plot model size vs. PPL for the two closest quants, would this custom quant yield a lower, equal, or higher perplexity? If there is a real benefit, it might be worth sharing your findings in the llama.cpp repo.
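Something like this rough sketch would make the comparison concrete; all sizes and PPL values below are made-up placeholders to be replaced with measured numbers:

```python
# Sketch of the size-vs-PPL comparison: plot the two nearest stock quants
# and the custom one, and check whether the custom point falls below the
# line between them. All numbers here are placeholders, not measurements.
import matplotlib.pyplot as plt
import numpy as np

quants = {
    "IQ1_S":  {"size_gb": 15.0, "ppl": 12.0},  # placeholder
    "IQ1_M":  {"size_gb": 16.5, "ppl": 10.5},  # placeholder
    "custom": {"size_gb": 15.8, "ppl": 11.0},  # placeholder
}

# PPL expected from linear interpolation between the stock quants at the
# custom quant's size; a lower measured PPL means the mix is a net win.
x0, y0 = quants["IQ1_S"]["size_gb"], quants["IQ1_S"]["ppl"]
x1, y1 = quants["IQ1_M"]["size_gb"], quants["IQ1_M"]["ppl"]
xc, yc = quants["custom"]["size_gb"], quants["custom"]["ppl"]
expected = np.interp(xc, [x0, x1], [y0, y1])
print(f"custom PPL {yc:.2f} vs interpolated {expected:.2f}")

for name, q in quants.items():
    plt.scatter(q["size_gb"], q["ppl"], label=name)
plt.plot([x0, x1], [y0, y1], linestyle="--", color="gray")
plt.xlabel("Model size (GB)")
plt.ylabel("Perplexity (lower is better)")
plt.legend()
plt.show()
```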