24B version?

#2
by AuriAetherwiing - opened

Hello, Gryphe!
Mistral Small 24B was released today, and I feel like it's a major upgrade over 22B; hell, over Qwen2.5-32B even. I'm a big fan of the Pantheon models and their unique flavor and personality, and I would just love to see a 24B Pantheon. Any chance that this might happen?

Totally! No idea about the "when" just yet but I don't think it'll be that long.

So, I gave it a couple shots, but something's seriously wrong with this new model! It's dry AF, it's full of logic flaws... Heck, Nemo's smarter than this one! I've sent some inquiries to Mistral about this, and I'm hoping this wasn't some desperation move by them due to the entire DeepSeek debacle.

No 24B for now, sadly. Y'all know I won't release things that aren't good.

Interesting.

I mean, all models from Mistral (direct/instruct) have been extremely repetitive in their output in my experience. While it's certainly not better with this release, I don't find it worse either. I've been running it since release (mostly for serious stuff, but a bit of creative writing too). It seems to do better with the format narration "dialogs" than with plain narration dialogs. Still repetitive, but a bit less so. I'm a bit curious about the 'dumber' part. Would you be willing to elaborate?

I had the same issue when I tried one of the first GGUF quants that were uploaded on HF, but when I tried the imatrix quants from mradermacher, it suddenly worked properly. I used the same version of koboldcpp for both, so there wasn't a llama.cpp update or anything else different besides the quant itself.
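For context on the static-vs-imatrix distinction being discussed: imatrix quants are produced by first measuring which weights matter most on some calibration text, then letting the quantizer use that information. A rough sketch of the workflow with llama.cpp's tools, where all file names are placeholders (this is illustrative, not the exact commands mradermacher or anyone in this thread used):

```shell
# 1. Compute an importance matrix from calibration text.
#    (model-f16.gguf and calibration.txt are hypothetical local files.)
./llama-imatrix -m model-f16.gguf -f calibration.txt -o imatrix.dat

# 2. Quantize with the importance matrix guiding which weights
#    keep more precision, producing e.g. a Q5_K_M quant.
./llama-quantize --imatrix imatrix.dat model-f16.gguf model-Q5_K_M.gguf Q5_K_M
```

A static quant skips step 1 entirely, which is one reason two quants of the same type from different uploaders can behave differently.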

Hmm, are you sure you didn't use LM-Studio's broken GGUFs? I had similar problems with them specifically. BF16 weights were fine, and for sure much smarter than MN. Bartowski's quants appear to be fine, as well.

Yes, exactly, that's what I meant. I think I also used the quants from LM-Studio, because they were among the first ones available. I'm not sure you're right about Bartowski's quants being fine, though, because as far as I remember the LM-Studio quants were done and uploaded by him for LM-Studio; maybe he changed them or did something differently with the quants he uploaded on his own account. What I'm sure about is that the imatrix quants from mradermacher work (at least the Q5_K_M and the Q6_K), and they're not comparable quality-wise with the ones I tried before.