chat_template in tokenizer_config.json?

#1
by nff - opened

Hello, and thank you for making this model available!

I had an issue loading it with the llm Python CLI tool v0.26 and llm-mlx v0.4, using the following command:

llm mlx download-model mlx-community/DeepSeek-R1-0528-Qwen3-8B-8bit

It failed with:

File "/.../.venv/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 1754, in get_chat_template
  raise ValueError(ValueError:
Cannot use chat template functions because tokenizer.chat_template is not set and no template argument was passed!
For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at
https://huggingface.co/docs/transformers/main/en/chat_templating

I looked up the linked chat templating docs and noticed a separate file named chat_template.jinja in the repo (link), so I figured this was what was missing from tokenizer_config.json.
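
For reference, here is roughly how the problem shows up outside the llm tooling (just a sketch, assuming transformers' AutoTokenizer and a locally downloaded copy of the repo's chat_template.jinja):

from transformers import AutoTokenizer

# Load the tokenizer the same way the llm tooling would.
tok = AutoTokenizer.from_pretrained("mlx-community/DeepSeek-R1-0528-Qwen3-8B-8bit")

# With no "chat_template" key in tokenizer_config.json (and a transformers
# version that doesn't pick up the separate chat_template.jinja file), this
# prints None, which is what triggers the ValueError above.
print(tok.chat_template)

# Passing the template explicitly works around it:
messages = [{"role": "user", "content": "hello"}]
print(tok.apply_chat_template(
    messages,
    chat_template=open("chat_template.jinja").read(),
    add_generation_prompt=True,
    tokenize=False,
))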

I found tokenizer_config.json in my local HF cache at ~/.cache/huggingface/hub/models--mlx-community--DeepSeek-R1-0528-Qwen3-8B-8bit/blobs and edited c4657dc1d512391538fdb16dd905945ad376062f (the blob for tokenizer_config.json) to add a "chat_template" key based on the Jinja template. I simply escaped double quotes and escaped literal \n as \\n, ending up with one long line that I added to the JSON:

  "chat_template": "{% if not add_generation_prompt [...] {{'<|Assistant|>'}}{% endif %}",

I wasn't sure I could really edit this file, given that its name looks content-addressed, but after this edit I was able to re-run llm mlx download-model to completion and then use the model locally:
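
In case it helps anyone else, the same edit can be scripted so that the JSON escaping is handled automatically (a sketch: the blob hash is the one from my cache above, and it assumes chat_template.jinja has been downloaded into the working directory):

import json
from pathlib import Path

blobs = (
    Path.home()
    / ".cache/huggingface/hub/models--mlx-community--DeepSeek-R1-0528-Qwen3-8B-8bit/blobs"
)
config_path = blobs / "c4657dc1d512391538fdb16dd905945ad376062f"  # blob for tokenizer_config.json
template_path = Path("chat_template.jinja")  # downloaded separately from the repo

config = json.loads(config_path.read_text())
config["chat_template"] = template_path.read_text()

# json.dumps takes care of escaping quotes, backslashes and newlines.
config_path.write_text(json.dumps(config, indent=2, ensure_ascii=False))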

$ llm chat -m mlx-community/DeepSeek-R1-0528-Qwen3-8B-8bit 
Chatting with ...

Should I have used different versions of these Python CLI tools? Is there one that would have found the separate Jinja file and understood what to do with it? Or should the chat_template key have been in tokenizer_config.json in the first place? Looking at the same files in other models I can see a chat_template key present, e.g. here for Devstral-Small-2505-8bit or here for Qwen3-4B-8bit.

I certainly don't know enough about these files to tell what they should contain, but I imagine others might encounter the same problem, given that this will probably be a popular model. It would be great if someone could comment here on what went wrong: is the JSON file missing this key when it should have it? Or did I do something wrong or miss a step, and if so, what was it?

Thanks!

MLX Community org

Hey, I hope you're doing well! I think you might be using the wrong pip packages. Try doing this and see if that helps:

pip install mlx-lm
pip show mlx-lm

Should return something like this:

Name: mlx-lm
Version: 0.24.1
Summary: LLMs on Apple silicon with MLX and the Hugging Face Hub
Home-page: https://github.com/ml-explore/mlx-lm
Author: MLX Contributors
Author-email: [email protected]
License: MIT
Location: /Users/benshankles/Documents/GitHub/MLX/.venv/lib/python3.11/site-packages
Requires: jinja2, mlx, numpy, protobuf, pyyaml, transformers
Required-by: mlx-vlm

Then:

mlx_lm.chat --model mlx-community/DeepSeek-R1-0528-Qwen3-8B-8bit
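
Or, if you prefer to drive it from Python rather than the CLI, something along these lines should also work (a rough sketch using mlx-lm's load/generate helpers):

from mlx_lm import load, generate

# Load the quantized model and its tokenizer from the Hub.
model, tokenizer = load("mlx-community/DeepSeek-R1-0528-Qwen3-8B-8bit")

# mlx-lm's tokenizer wrapper exposes the usual chat-template API.
messages = [{"role": "user", "content": "Hello, who are you?"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

print(generate(model, tokenizer, prompt=prompt, max_tokens=256))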

Let me know if that works!
