chat_template in tokenizer_config.json?
Hello, and thank you for making this model available!
I had an issue loading it with the llm Python CLI tool v0.26 and llm-mlx v0.4, using the following command:
llm mlx download-model mlx-community/DeepSeek-R1-0528-Qwen3-8B-8bit
It failed with:
File "/.../.venv/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 1754, in get_chat_template
    raise ValueError(
ValueError: Cannot use chat template functions because tokenizer.chat_template is not set and no template argument was passed!
For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at
https://huggingface.co/docs/transformers/main/en/chat_templating
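For context, the failing check can be reproduced directly with transformers, independent of llm (a minimal sketch; newer transformers releases may pick the template up from a standalone chat_template.jinja, so whether it reproduces probably depends on which version llm-mlx pulls in):

from transformers import AutoTokenizer

# Loads tokenizer_config.json from the local HF cache (or downloads it).
tok = AutoTokenizer.from_pretrained("mlx-community/DeepSeek-R1-0528-Qwen3-8B-8bit")

print(tok.chat_template)  # None here, which is what triggers the error

# With chat_template unset, this raises the ValueError quoted above.
tok.apply_chat_template([{"role": "user", "content": "hi"}], tokenize=False)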
I looked up the linked chat_templating docs and noticed a separate file named chat_template.jinja in the repo (link), so I figured this was what was missing from tokenizer_config.json.
I found tokenizer_config.json in my local HF cache at ~/.cache/huggingface/hub/models--mlx-community--DeepSeek-R1-0528-Qwen3-8B-8bit/blobs and edited c4657dc1d512391538fdb16dd905945ad376062f (the blob for tokenizer_config.json) to add a "chat_template" key based on the Jinja template. I simply escaped double-quotes and escaped literal \n as \\n, ending up with a long line that I added to the JSON:
"chat_template": "{% if not add_generation_prompt [...] {{'<|Assistant|>'}}{% endif %}",
I wasn't sure I could really edit this file, given that it looked like a content-addressable file name, but after this edit I was able to re-run llm mlx download-model to completion and subsequently use the model locally:
$ llm chat -m mlx-community/DeepSeek-R1-0528-Qwen3-8B-8bit
Chatting with ...
Should I have used different versions of these Python CLI tools? Is there one that would have found the separate Jinja file and understood what to do with it? Or should the chat_template key have been in tokenizer_config.json in the first place? Looking at the same files in other models, I can see a chat_template key present, e.g. here for Devstral-Small-2505-8bit or here for Qwen3-4B-8bit.
I certainly don't know enough about these files to tell what they should contain, but I imagine others might run into the same problem, given that this will probably be a popular model. It would be great if someone could comment here on what went wrong: is the JSON file missing this key when it should have it? Or did I do something wrong or miss a step, and if so, what was it?
Thanks!
Hey, I hope you're doing well! I think you might be using the wrong pip packages. Try doing this and see if that helps:
pip install mlx-lm
pip show mlx-lm
It should return something like this:
Name: mlx-lm
Version: 0.24.1
Summary: LLMs on Apple silicon with MLX and the Hugging Face Hub
Home-page: https://github.com/ml-explore/mlx-lm
Author: MLX Contributors
Author-email: [email protected]
License: MIT
Location: /Users/benshankles/Documents/GitHub/MLX/.venv/lib/python3.11/site-packages
Requires: jinja2, mlx, numpy, protobuf, pyyaml, transformers
Required-by: mlx-vlm
Then:
mlx_lm.chat --model mlx-community/DeepSeek-R1-0528-Qwen3-8B-8bit
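If you'd rather drive it from Python than the CLI, something along these lines should also work (a quick sketch based on the mlx-lm API I'm using; the exact generate arguments can vary between versions):

from mlx_lm import load, generate

# load() fetches the model if needed and returns the model plus tokenizer.
model, tokenizer = load("mlx-community/DeepSeek-R1-0528-Qwen3-8B-8bit")

# Apply the chat template before generating.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    add_generation_prompt=True,
    tokenize=False,
)
print(generate(model, tokenizer, prompt=prompt, max_tokens=256))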
Let me know if that works!