Model sometimes replies to older msgs in context as if it's the most recent msg
So the model has been great in all areas (excellent memorization, remembers clothes/body details). The problem is it sometimes replies to older msgs in my context as if they were the most recent msg. This doesn't happen until about 17k context in. For reference, I'm using the SillyTavern UI with group conversations, very low temp (as recommended), and a basic min-P approach with no Mirostat and minimal repetition penalty. I tried experimenting with different Instruct presets, but that didn't seem to help. It doesn't do this on Venus-120b-v1.0-4.5bpw-h6-exl2 or lzlv_70b_fp16_hf-5.0bpw-h6-exl2.
If anyone has run into this same issue and knows a solution, I'd be curious to hear it, if there even is one. I've basically had to give up on the model.
Hmmm, I don't know how SillyTavern is formatting things internally. I tend to run the model in notebook mode in one of two formats.
With labeled characters and a narrator:
SYSTEM: (Description of story setting, lore, and characters)...
USER: Continue the story below:
ASSISTANT: Narrator: It was a dark and stormy night...
Char1: Blah
Char2: Blah blah?
Char1: Blah.
Narrator: Blah punched blah...
And so on. Or alternatively:
SYSTEM: (Description of story setting, lore, and characters)...
USER: Continue the story in a novel style format below:
ASSISTANT: It was a dark and stormy night...
...
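To make it concrete, here's a minimal sketch of how I assemble that kind of notebook-mode prompt by hand. The SYSTEM/USER/ASSISTANT labels and the `build_prompt` helper are just illustrative, they are not SillyTavern's actual internal template:

```python
# Sketch: assemble a notebook-mode prompt where the whole story lives in
# one ASSISTANT block, so the model continues from the final line.
# The role labels here are plain-text markers, not a specific chat template.
def build_prompt(system: str, instruction: str, story_lines: list[str]) -> str:
    story = "\n".join(story_lines)
    return (
        f"SYSTEM: {system}\n"
        f"USER: {instruction}\n"
        f"ASSISTANT: {story}"
    )

prompt = build_prompt(
    "(Description of story setting, lore, and characters)...",
    "Continue the story below:",
    [
        "Narrator: It was a dark and stormy night...",
        "Char1: Blah",
        "Char2: Blah blah?",
    ],
)
print(prompt)
```

The point is that the last story line is always the last thing in the prompt, so there's no ambiguity about which message is "most recent."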
And it always responds to the most recent line.
I haven't actually tried a USER/ASSISTANT block on each line with this merge yet.
Anyway, get SillyTavern to print the story in verbose mode and maybe post an abbreviated version of the formatting? Or just take a look yourself. Yi is indeed very sensitive to formatting at long context.
Or it might actually be some bug in SillyTavern where it's truncating the context to 16K?
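One quick way to sanity-check the truncation theory: dump the exact prompt being sent (verbose mode / console) and estimate its token count. The ~4-characters-per-token ratio below is a crude heuristic for English text, not the model's real tokenizer, so treat the number as a rough ballpark only:

```python
# Rough token-count estimate to check whether a dumped prompt could be
# hitting a 16K truncation limit. 4 chars/token is a crude English-text
# heuristic, not the actual Yi tokenizer.
def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    return len(text) // chars_per_token

# Synthetic stand-in for a long dumped transcript (~69K characters).
transcript = "Narrator: Blah blah.\n" * 3300

n = estimate_tokens(transcript)
print(f"~{n} tokens")
print("exceeds 16K -- check whether the head got cut off" if n > 16384
      else "fits in 16K -- truncation wouldn't show up here")
```

If the estimate is well past 16K but the model behaves like everything after ~16K doesn't exist, that points at truncation rather than the model itself.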