Promising!
Trying this out and it seems pretty darn good! I'll probably keep using it for the rest of the day.
edit: At about 4k context the output length cuts nearly in half, and at 8k context tokens I see the same pattern as other models, where words and qualifiers go missing. It's not enough to damage the output, but it does stand out.
Output quality is definitely higher when the context is small. Alas, the missing "the, he, she" and other small words make the output a little annoying to constantly correct as roleplays get longer. Not bad for a 24B model. It would probably be fine for rewriting small blocks of text, or for translation if you gave it that to do. Or for generating ideas, characters, and other things under the 4k context range.
I haven't tested it, but the way the model is acting suggests that if you give it too much, it will just echo what you gave it rather than doing what you asked (say, rewriting a story or analyzing a long one).
Thanks for the detailed feedback. It would be interesting to know which settings and which quant you used when testing the model. In my use, I noticed that the model works better with the old Mistral preset than with Mistral-Tekken or Mistral V7.
> Alas, the missing "the, he, she" and other small words make the output a little annoying to constantly correct as roleplays get longer.
English is not my native language, so I probably don't notice some grammatical errors, but I definitely haven't encountered this. I use q4_k_m.
I use q6_k for models 32B or smaller and q4_k_m for 70B models, depending on size. I see I was using the Q6_K version here.
The missing words (he, she, the, the 's' at the ends of some words) are not unique to this model; I've noticed it on other 8B-24B models at about the same point, once you hit roughly 8k context or larger. The output starts to read more like bullet-point shorthand, 'went to mall' instead of 'he went to the mall'. It's still completely readable in context, just not as nice-flowing. And if you don't edit the missing words back in, the likelihood of them being missing in follow-ups greatly increases.
Your case is quite unusual. I just tested the model with up to 24k context, and the words were in place and in the correct form. If you are sure that your quant is not broken, I advise you to look for the cause in the sampler settings, preset, or system prompt. I'm leaving the discussion open, but I have nothing more to say; close it if you want. Good luck and have a nice RP.
Well, in SillyTavern the temperature is 0.65-0.75 and top_P is 1. I'm not sure about other options that aren't on display.
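For what it's worth, here's a rough sketch of how those sampler values would look if sent straight to a llama.cpp-style /completion endpoint, just to make the numbers concrete. The URL, prompt, and response length below are placeholders, and SillyTavern actually passes these values through its own backend connection, so treat this as an illustration rather than my exact setup:

```python
# Sketch only, not my exact setup: assumes a local llama.cpp server on port 8080.
# The URL, prompt, and n_predict value are placeholders.
import requests

payload = {
    "prompt": "Your roleplay prompt here",  # placeholder
    "temperature": 0.7,                     # within the 0.65-0.75 range I mentioned
    "top_p": 1.0,                           # top_P left at 1, i.e. effectively off
    "n_predict": 512,                       # placeholder response length
}

resp = requests.post("http://127.0.0.1:8080/completion", json=payload)
print(resp.json()["content"])
```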
Maybe it's certain RPs or kinds of writing that push it outside the original scope/layout; I'm not sure.