jondurbin/bagel-34b-v0.4 · I can’t help but feel like it is worse.

Feb 24, 2024

While I haven't done extensive testing, I have been playing around with this in the last couple days. More precisely, I have been using https://huggingface.co/LoneStriker/bagel-34b-v0.4-GGUF on Koboldcpp.

When I compare generations from 0.2, 0.4 does seem much more intelligent but its creativity has take a significant hit to the point I thing it is flat out a worse model for creative writing.

I have been using ChatML on 0.2 and Lama2-chat on 0.4 I have tried ChatML to but it was even worse, I still need to test the other formats.

Also, Kcpps prompt template presets look a bit different then what is shown hear, maybe that could have coursed an issue.

jondurbin

Owner Feb 25, 2024

I suspect it's because I reduced the quantity of cinematika data (amongst other issues). I'm probably going to remake this model with the mix back closer to 0.2 (and without the chatml tokens, because they seem to be causing problems).

jondurbin

Owner Mar 4, 2024

Hey @Nycoorias , the "redo" of bagel-34b to address some of these issues is in progress. You can follow status of training here if you'd like: https://wandb.ai/jondurbin/bagel-34b-v0.5/runs/f3t8wcwj?workspace=user-jondurbin

In the meantime, you could give airoboros-34b-3.2 a try: https://huggingface.co/jondurbin/airoboros-34b-3.2

Nycoorias

Mar 4, 2024

Thanks @jondurbin , I will comment on the new model as soon as gguf becomes available.

P.S.
Are comments like this one helpful at all?

jondurbin

Owner Mar 4, 2024

P.S.
Are comments like this one helpful at all?

100% - I absolutely need and appreciate feedback, it's useful whether positive or negative.