Mistral Small 24B

#19
by HandsomeMagyar - opened

I've just tested it and half of the answers were characters à la Chinese/Japanese. Is this provisional, or was something done wrong?

It breaks with DRY and XTC.

Mistral AI org

Can you post some examples of the failure cases?

It doesn't break with DRY.

XTC is a garbage, sub-optimised technique made by and for gooners in a moment of general self-coital confusion around "AI slop", while skew already exists (on exl2 at least; people in charge of various projects should think about implementing it everywhere. Turbo's graphs and code are understandable even by a self-taught basic B**** like me: type "skew" in the search bar on his codebase, you're welcome). But there is zero reason for XTC to bug out either.

XTC gives a percentage chance of deliberately discarding the most likely tokens whenever several of them clear a probability threshold, forcing the model onto much less likely ones. At least that's the gist of it. Basically it makes models (a LOT) more stupid in the hope of improving output variety. (Yes, it was invented by and for those types of people.) No model should ever be tested with this sampler.
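To make that gist concrete, here's a minimal sketch of the mechanism as described above. This is my own simplified approximation, not the reference implementation; the `threshold` and `probability` parameter names are illustrative:

```python
import numpy as np

def xtc_filter(logits, threshold=0.1, probability=0.5, rng=None):
    """Rough sketch of an XTC ("exclude top choices") style filter.

    With chance `probability`, whenever two or more tokens exceed the
    `threshold` probability, all of them except the *least* likely one
    are masked out, pushing the model toward less obvious continuations.
    """
    rng = rng or np.random.default_rng()

    # Softmax to get token probabilities from raw logits.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    above = np.flatnonzero(probs > threshold)
    # Only intervene some of the time, and only when several "top choices" exist.
    if rng.random() >= probability or above.size < 2:
        return logits

    # Keep the least likely of the top choices; exclude the rest.
    keep = above[np.argmin(probs[above])]
    masked = logits.copy()
    masked[np.setdiff1d(above, keep)] = -np.inf
    return masked
```

You would then sample from the softmax of the returned logits as usual; the effect is exactly the "throw away the obvious picks some of the time" behaviour described above.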

That said, given how incredibly repetitive Mistral models' prose is (even outside of this use case), and more so with every release, it's not particularly surprising that people reach for all kinds of sampling settings to try to remediate the low-quality prose. This problem is kind of unique to Mistral models; even task/function-call oriented models like Cmd-R write in a less dull and repetitive fashion, which is kind of puzzling.
