Sunfall vs Daybreak models

#1
by OrangeApples - opened

Hi @crestf411 . If you have two models out, each using Sunfall and Daybreak LoRAs, that are both based on the same model (let's say L3), which one would you generally recommend for roleplaying in SillyTavern? What main pros and cons would models have over each other?

Sunfall is Daybreak + some extra instruction tuning with tags and such. It has been taught to write specific stories based on a synopsis, tags, character descriptions, etc., and the training was done by copying the SillyTavern prompt formatting. In theory, Sunfall should always be better than Daybreak. In practice, Sunfall is still a work in progress: I am still figuring out how to apply the instructions, which sources to use for the instruction-style data, and so on, so some people still prefer Daybreak.

Thanks for the clear and concise answer. Big fan of crestf411/L3-70B-sunfall-abliterated-v0.2, by the way. After using it, I find it hard to return to other models with such frequent GPTisms.

Thank you for the feedback! Unfortunately I get very little of that so it's hard to tell sometimes! :)
I am working my way back up to the big models again, so hopefully will have an update for you sometime soon.

That's great news! Looking forward to your next updates.
Just an observation after a bit more testing: crestf411/L3-70B-sunfall-abliterated-v0.2 triggers the EOS token quite early in the conversation relative to other models. When this happens, I usually switch to something like Wizard-LM-2-8x22B to continue the chat without the EOS triggering. I'm not sure if all Sunfall models are like this or what exactly is causing this behavior, but perhaps it's worth mentioning.

Edit: One more thing, the GPTisms aren't totally gone, especially the "shivers down the spine" one, but they pop up significantly less often when using your Sunfall model + the Diamond Law lorebook.

Yes, one of the issues I had early on was short response lengths. It's been addressed in the smaller variants, so it should be gone in the next release.

FYI I am finishing up a new 70B model based on Llama 3.1 70B. The LoRA is already up here, and I do plan to upload a merge with the 3.1 70B Instruct model, which does still give more refusals compared to the abliterated variant you mentioned.

Just saw the new model posted on your page about 3 mins ago. Great to hear that you're still working on these! I'm optimistic to see how much better these Llama 3.1 70B models are compared to the old Llama 3. Will definitely give it a try once the GGUFs are out.

Please do. I should warn you though, they are a little fickle. See the model card. I had to drop the temp down a bit to keep it from making a bunch of dumb mistakes. The model itself is really smart, but with too high a temp, it gets flustered, I guess?
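For anyone wondering why temp matters so much here, a minimal sketch of the general idea, not the actual sampler in any particular backend: sampling temperature divides the token scores before softmax, so a lower temp concentrates probability on the top candidate and a higher temp spreads it onto unlikely tokens. The function name and the logit values below are made up purely for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw token scores to probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens (illustrative values only).
logits = [2.0, 1.0, 0.1]
for temp in (1.5, 1.0, 0.7):
    probs = softmax_with_temperature(logits, temp)
    print(temp, [round(p, 3) for p in probs])
```

Running it shows the top token's share growing as temp goes down, which is roughly why a lower setting cuts down on odd token choices. The right value for a given model is still something to take from its model card.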

Got it. Thanks for the heads up on the temp 👍
