Qwen/Qwen3-1.7B · Fix chat template in case of multiple assistant messages and no thinking

May 29

Previously when messages contained multiple assistant messages, applying tokenizer template with enable_thinking=False would result in applying no thinking tokens to the first assistant message, but applying them to the second assistant message.

For example,

messages = [
    {'role': 'user', 'content': 'i am user 1'},
    {'role': 'assistant', 'content': 'i am assistant 1'},
    {'role': 'user', 'content': 'i am user 2'},
    {'role': 'assistant', 'content': 'i am assistant 2'},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=False,
    truncate=True,
    return_tensors='pt',
    enable_thinking=False

).squeeze(0)

Fix chat template in case of multiple assistant messages and no thinking313ac59d

sdpkjc

Jun 6

I ran into the same issue as well—after manually applying the chat_template from this PR, everything worked correctly. Hope this gets merged soon!

VityaVitalich

Jun 6

Happy this was useful! In case anyone needs the model that could be easily downloaded with this issue resolved.

https://huggingface.co/VityaVitalich/Qwen3-1.7B