Really Fun Model

#3
by isr431 - opened

I tested this model yesterday, and I’ve been really enjoying it! It feels like a breath of fresh air compared to the numerous nemo fine-tunes and merges. Qwen2.5 14b seems to grasp context more effectively and intelligently introduces relevant objects or events more frequently. It’s undoubtedly smarter than nemo in this regard.

I haven’t encountered any significant issues with its prose. While it can sometimes fall into repetition after a few messages, turning on the DRY sampler helps. Generally, I use this model for the initial messages and then switch to a nemo-based model for continuation. Occasionally, it appends brackets ([]) at the end of a message or includes refusals within them (even when the message itself isn’t censored). However, SillyTavern filters these out, so it’s not a major problem.

Qwen feels like it has so much untapped potential, unlike nemo which seems oversaturated at this point. Between this and Eva Qwen2.5 14B, I prefer Kunou because it's more 'stable'.

I came to the opposite conclusion, tbh. For me, EVA comes out on top. Maybe my settings are suboptimal or I'm doing something wrong, idk.

Kunou-1 had some repetition issues in my case. It can also generate very deterministic responses, similar to Nemo and Mistral Small derivatives, with little to no variation between swipes at the recommended settings. And sometimes it breaks down into repetitive sequences or gibberish when the context holds a lengthy conversation. For instance, when I had a reply primer saying
as the
the model always continued it with nonsense like
约定 bezpo翻译

Eva-Qw2.5-14B v0.2, meanwhile, continued just fine from the same spot.

That said, knowing the author, I am sure these issues will be sorted out eventually. It's hard to always be the best, especially with a relatively understudied base model. Looking forward to the next iterations of the model or the new ones!
