Not bad at all, but...


This model performs pretty well, but it has a strange quirk of randomly leaking the "|" symbol into the text. I'm using the recommended temp, min_p, top_p, and DRY settings, but from time to time the model inserts "|" after each sentence.

Odd, I haven't noticed any leakage. What quant are you using to run the model? That's the first place I'd look for that sort of issue I think.

Might also be worth verifying you're using Mistral v7 Tekken template, and not accidentally running another template.

I'm using i1-Q5_K_M, and yes, using Mistral v7 template.

I'd still give another quant a go, just to rule it out:

https://huggingface.co/bartowski/zerofata_MS3.2-PaintedFantasy-v2-24B-GGUF

The only other thing I can think of is sampler settings, or something specific to the conversation. I gave the Q5_K_M quant from mradermacher a go for a few hours and wasn't able to reproduce the issue.

If you do keep getting the issue, a screenshot would be helpful.

Can confirm this; the "|" sometimes appears spontaneously right at the end of a sentence, on static Q6_K.
I worked around it in my Python implementation by just removing it once generation is done: OutText = OutText.replace("|", ""). But this is more of a temporary solution.
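
If you want something slightly less blunt than stripping every pipe, here's a minimal sketch (the function name and regex are my own, not part of the model or any library) that only removes a "|" sitting right after sentence-ending punctuation, which is where the leak seems to show up:

```python
import re

def strip_stray_pipes(text: str) -> str:
    # Drop a lone "|" that appears right after sentence-ending punctuation
    # (optionally followed by a closing quote), keeping the punctuation itself.
    return re.sub(r'([.!?]["\']?)\s*\|\s*', r'\1 ', text).rstrip()

# Example:
# strip_stray_pipes('She nodded.| "Fine."|')  ->  'She nodded. "Fine."'
```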

Hmm, that's two different quants and setups, so I guess the issue has to be the model.

Thanks for the feedback, I'll keep an eye out for it.

Just a thought, what roleplay format do you use?

Actions: In plaintext
Dialogue: "In quotes"
Thoughts: In asterisks

Or

Actions: In asterisks
Dialogue: "In quotes"
Thoughts: In backticks (or not at all)

The first set is what the model is trained on, but the second set is what tends to be common in RP. It's possible the format could be responsible for these tokens leaking, since it takes the model somewhat off distribution.
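
If an existing chat history is already written in the second format, a rough one-off pass like this could convert it to the trained format before continuing (purely a sketch; the function name and regexes are mine, not anything the model requires):

```python
import re

def to_trained_format(text: str) -> str:
    # *action* -> action  (actions back to plaintext); do this first so the
    # asterisks added for thoughts below aren't stripped again
    text = re.sub(r"\*([^*\n]+)\*", r"\1", text)
    # `thought` -> *thought*  (thoughts into asterisks); quoted dialogue is untouched
    return re.sub(r"`([^`\n]+)`", r"*\1*", text)
```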

I used only the first format:

Actions: In plaintext
Dialogue: "In quotes"
Thoughts: *In asterisks*

But in long conversations the model is prone to forgetting about *thoughts*, or uses them only rarely. (I tried to force it with OOC, but that has maybe a 30% success rate.) When I add a new character it works fine again for a while.
Other settings I use:

repetition_penalty: 1.0-1.05
repeat_last_n: 2000
temperature: 0.8-1.2
top_p: 0.95
min_p: 0.05
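
For reference, this is roughly how those samplers would map onto a request to a local llama.cpp server; a minimal sketch assuming a stock /completion endpoint on port 8080, so adjust for whatever backend you actually run:

```python
import requests

payload = {
    "prompt": "...",          # formatted Mistral v7 Tekken prompt goes here
    "temperature": 1.0,       # I vary this in the 0.8-1.2 range
    "top_p": 0.95,
    "min_p": 0.05,
    "repeat_penalty": 1.05,   # repetition_penalty 1.0-1.05
    "repeat_last_n": 2000,
    "n_predict": 512,         # assumed response length cap
}
resp = requests.post("http://127.0.0.1:8080/completion", json=payload)
print(resp.json()["content"])
```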

The model works best at contexts up to ~10K; after that it starts to lose quality, and it's hard to get it to recall anything from the beginning of the context. But maybe that's because of the 6-bit quant.
