IQ4_KSS

#9
by antolucib - opened

Following up on the reddit discussion, any chance the IQ4_KSS quant could be made for the non thinking model as well?

Great, thanks for putting this here which makes it easier for me to find, I'll start cooking it and update the repo this morning before getting too busy!

@antolucib

Okay, the IQ4_KSS is uploading and also decided to upload the IQ3_KS as well which is a bit smaller but still pretty smort. Perplexity graph and README is updated, files will complete uploading shortly!

@antolucib

Okay, the IQ4_KSS is uploading and also decided to upload the IQ3_KS as well which is a bit smaller but still pretty smort. Perplexity graph and README is updated, files will complete uploading shortly!

i just tested IQ4_KSS went very well but when i am testing same in thinking model its didt went very well like the model thinking too much. i will try to increse the pp and see how its behave and will update if i find anythink good.

update
i am gonna only use non thinking IQ4_KSS since thinking model is thinking too log so i am stick with q5 for that

@gopi87

Right, perhaps it was you or someone else who pointed out the thinking version will yap and yap sometimes unless you follow the instructions on the original Qwen3 model card: https://huggingface.co/Qwen/Qwen3-235B-A22B-Thinking-2507-FP8#best-practices regarding sampling and presence_penalty settings.

Sign up or log in to comment