Amazing, thank you so much. Small question :)

by CyberTimon - opened


Thank you very much for these quants. They work amazingly well. I would LOVE to see Command-R Plus (the 104B variant) quantized with this method. On my 64GB M1 Max it would make a huge difference, because I can only load it at 3-bit.

Thanks again and kind regards,

Thank you, Timon, for your feedback!
RE: Command-R Plus:
Currently this is beyond my hardware's ability, as I quant/build everything locally.

Command-R Plus is on my list, and should it be possible to "upgrade" it, I will post it and ping you via this post.

Maybe you could tell me the secret to quantizing it this well, then I could do it on my own. I have the compute to do this. I would be very thankful. Are you OK with this?

The NEO Class is an Imatrix dataset constructed by hand, piece by piece. It was built on 120+ labs and 240+ hours of testing and research (including testing and evaluating 50+ imatrix datasets in use) and a lot of trial and error. These datasets are complex and difficult to work with, to put it mildly.

Frankly, this tech is still in its infancy and there is more development ahead, especially considering the effect(s) with X quants built on NEO Class datasets / models.
(The first Command-R 35B xq200 "X quant" was just released yesterday after evaluating the "real world" results of the upgrade first.)
With these factors in mind, I will not be sharing these or how they work at this time.

That being said, I will share this:

There is a LOT of room for improvement with current Imatrix datasets in use (I mean all of them, because I tested and modified them and "witnessed" the improvement(s)).
The current "construction" of Imatrix datasets is almost a "throw enough mud at the wall and something may work" mentality, which is a form of insanity, from my point of view, considering the precise nature of LLMs.
Plus some Imatrix datasets in use actually hurt performance (relative to standard quants).
That last point was one of the reasons I pursued the "Neo Class" datasets... for my own use cases as well as others'.

Thanks for your answer, that's understandable. Still, thanks for your service to the community ✌️
