Update tokenizer_config.json

#67

add "{% if enable_thinking is defined and enable_thinking is false %}{{'<think>\n\n</think>\n\n'}}{% endif %}"

Add support for empty think block injection in chat template

Description

This PR adds support for the enable_thinking parameter in the chat template to control chain-of-thought reasoning, achieving feature parity with Qwen3.

Why it's needed

Many inference frameworks (SGLang, vLLM) and applications need to control whether a model emits reasoning steps. The enable_thinking parameter provides a standardized way to do the following (see the request sketch after this list):

  • Improve inference speed when reasoning isn't needed
  • Ensure consistent output structure for parsing
  • Match behavior across different model families
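For example, assuming vLLM's OpenAI-compatible server, which can forward a chat_template_kwargs field from the request into the chat template (the endpoint, model id, and prompt below are placeholders, not part of this PR):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="your/model",  # placeholder model id
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
    # Forwarded into the template context, where it triggers the new branch.
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
print(response.choices[0].message.content)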

Usage

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your/model")  # placeholder: use this repo's model id
messages = [{"role": "user", "content": "What is 2 + 2?"}]

# With thinking enabled (default behavior - unchanged)
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # or omit for the default
)
# Prompt ends with: <|Assistant|>

# With thinking disabled (new behavior)
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)
# Prompt ends with: <|Assistant|><think>\n\n</think>\n\n

Implementation

The change adds a single line to inject an empty think block when enable_thinking=False:

{% if enable_thinking is defined and enable_thinking is false %}{{'<think>\n\n</think>\n\n'}}{% endif %}

This follows Qwen3's approach where:

  • enable_thinking=False strictly disables reasoning by injecting an empty think block
  • The empty block signals the model to skip chain-of-thought generation
  • Recommended for efficiency-critical scenarios
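To see the branch in isolation, here is a minimal sketch that renders just the added fragment directly with the jinja2 library (which transformers uses for chat templates); the <|Assistant|> prefix stands in for the rest of the template:

from jinja2 import Template

# Only the added branch, prefixed with the assistant tag for context.
fragment = Template(
    "<|Assistant|>"
    "{% if enable_thinking is defined and enable_thinking is false %}"
    "{{'<think>\n\n</think>\n\n'}}"
    "{% endif %}"
)

print(repr(fragment.render()))                       # '<|Assistant|>'
print(repr(fragment.render(enable_thinking=True)))   # '<|Assistant|>'
print(repr(fragment.render(enable_thinking=False)))  # '<|Assistant|><think>\n\n</think>\n\n'

Note that Jinja's false test matches only the boolean False, so values like None or 0 leave the prompt unchanged; the injection only happens when the flag is explicitly set to False.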

Backward Compatibility

Fully backward compatible: the rendered prompt is unchanged unless enable_thinking=False is explicitly passed. The default path and enable_thinking=True both skip the new branch.
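A quick way to confirm this, sketched with the same placeholder model id as above: render the prompt with the flag omitted and with enable_thinking=True and check that they match.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your/model")  # placeholder model id
messages = [{"role": "user", "content": "Hi"}]

default = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
enabled = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=True)

# The new branch only fires on enable_thinking=False, so these must match.
assert default == enabled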

erichartford changed pull request status to closed
