First token seems bugged out
The first token is not the opening `<think>` tag: the model starts spitting out random words, then responds after "closing" the thinking tag (which never appeared as the first token). After the second or third prompt it seems to go back to normal with the thinking tags. Used the recommended Qwen3 settings (temperature 0.6).
In LM Studio: Failed to parse Jinja template: Parser Error: Expected closing statement token. OpenSquareBracket !== CloseStatement.
In llama-cli:
common_chat_templates_init: failed to parse chat template (defaulting to chatml): Expected value expression at row 18, column 30:
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
                             ^
{%- set index = (messages|length - 1) - loop.index0 %}
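For reference, the parse failure appears to come from the Python-style reverse slice `messages[::-1]`, which full Jinja2 accepts but the stricter template parsers in llama.cpp and LM Studio apparently do not. A minimal sketch of a workaround, assuming the variable names from the error above (the loop body is illustrative, not the real template):

```
{# Original line that trips the parser: #}
{%- for message in messages[::-1] %}
    ...
{%- endfor %}

{# Equivalent without the [::-1] slice: iterate forward, index from the end #}
{%- for i in range(messages|length) %}
    {%- set message = messages[messages|length - 1 - i] %}
    ...
{%- endfor %}
```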
And it speaks Chinese for some reason.
Ah right right .. I didn't apply the template fix from the other models here :')
Tried Q6_K_L, Q5_K_L, and Q5_K_M; all have the same issues:
It speaks Chinese a lot, especially in the very first answer, and the output is not very articulate (if at all).
They all like to insert the \u200B (zero-width space) character randomly into code, which is very annoying since it's invisible to the eye.
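Since U+200B is invisible in most editors but breaks identifiers, a quick post-processing filter can strip it from model output. A minimal sketch (the function name and character set are my own, not from any of the tools above):

```python
# Remove invisible zero-width characters (e.g. U+200B) that the
# quantized model sometimes sprinkles into generated code.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff"}

def strip_zero_width(text: str) -> str:
    """Return text with all zero-width characters removed."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)

if __name__ == "__main__":
    sample = "def\u200b add(a, b):\n    return a + b"
    print(strip_zero_width(sample))  # the \u200b after "def" is gone
```

Running the model output through something like this before saving code avoids hard-to-debug syntax errors from the hidden characters.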
Didn't encounter that problem with the 30B version.
The template is incorrect (not a big deal; you can copy-paste the one from Qwen2.5 or from any working model).
They don't follow previous corrections, so instead of fixing the existing code they generate new code with new issues.
Not usable.
Someone said that after one or two outputs it starts behaving normally, and I can confirm that. But you have to replace the Jinja template with a correct one; I did that in LM Studio.