This is the first Qwen3 A3B model that doesn't immediately start repeating itself

by SuperbEmphasis

The 30B A3B model is just about the perfect size and design for the 24GB graphics cards so many of us have.

These Qwen3 models are smart, but they seem to have been designed to take in information and answer with facts. When I used other Qwen3 30B A3B fine-tunes, they would respond as the character with nearly identical sentences in each reply. Which sort of makes sense: the same "facts" in the context keep producing the same "facts" in the response.

However, your finetune is one of the first that doesn't seem to fall immediately into that cycle! I think the Qwen3 models have a ways to go before they have a place in RP, but I just wanted to say I was quite impressed that you seem to be the first one to at least begin solving this problem!

I am actually trying a finetune on top of your finetune, but on my little 24GB GPU the Unsloth run might take a while to finish. Great work! I look forward to seeing how far you can push the model!
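For anyone curious, here's roughly what my setup looks like. This is just a sketch of the usual Unsloth + TRL QLoRA flow, not my exact script: the model name, dataset, and hyperparameters below are placeholders, and depending on your TRL version some of these kwargs may need to move into an SFTConfig instead of TrainingArguments.

```python
# Rough QLoRA sketch with Unsloth + TRL. Model name, dataset, and
# hyperparameters are placeholders -- adjust them for your own run.
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

max_seq_length = 4096

# Load the finetune in 4-bit so it fits on a single 24GB card.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="your-username/your-qwen3-30b-a3b-finetune",  # placeholder
    max_seq_length=max_seq_length,
    load_in_4bit=True,
)

# Attach LoRA adapters; only these small matrices get trained.
# Attention-only targets here -- the MoE expert MLPs may need extra care.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    use_gradient_checkpointing="unsloth",
)

dataset = load_dataset("your-username/your-rp-dataset", split="train")  # placeholder

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    args=TrainingArguments(
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=2,
        learning_rate=2e-4,
        bf16=True,
        logging_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()
```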

Let me know how your tune-upon-a-tune turns out! Maybe it's just a matter of more epochs or somethin'... MoE models are weird!

I will say, when I was testing the 4B model, I trained it with the default Unsloth learning rate and it was still repetitive. But then I tried tripling the learning rate, and I swear it seemed a good bit better...
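If anyone wants to try the same thing, the only change is the learning rate in the trainer args. A minimal sketch, assuming the 2e-4 that the Unsloth notebooks typically use as the starting point (the other values are placeholders, not a recommendation):

```python
# Same SFTTrainer setup as above, only the learning rate changes.
from transformers import TrainingArguments

args = TrainingArguments(
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    num_train_epochs=2,
    learning_rate=6e-4,   # ~3x the usual 2e-4 -- seemed to reduce repetition for me
    bf16=True,
    logging_steps=10,
    output_dir="outputs-4b",
)
```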
