Upload tokenizer_config.json
Bug Fix Summary: Corrected Jinja Chat Template for System Prompt Handling in vLLM
Problem: The original chat_template in tokenizer_config.json for the Cogito model raised TypeError: can only concatenate str (not "list") to str when vLLM rendered a conversation that included a system prompt, particularly one followed by tool interactions.
Root Cause Analysis: The error originated from the template's logic attempting to handle all messages uniformly within a single loop ({% for message in messages %}). This led to issues when:
Processing the first message when it was a system prompt, especially when concatenating its content or handling the tool-related preamble.
Performing lookahead/lookbehind checks for consecutive tool-role messages: using loop.index on the (potentially modified) message list while simultaneously indexing into the original messages list led to index mismatches, and to attempts to concatenate incompatible types near list boundaries. An illustrative reconstruction of the buggy pattern follows.
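For concreteness, here is a minimal, hypothetical reconstruction of the problematic single-loop pattern (not the verbatim original template; the <|system|> marker and <tool_results> tag are placeholders). With a conversation like [system, user, assistant, tool, tool], the system branch fails as soon as messages[0].content is a list of content parts rather than a string, and the unguarded tool lookbehind wraps around to the last message:

```jinja
{#- Illustrative reconstruction of the buggy pattern (simplified):
    every role is handled uniformly in one pass over `messages`. -#}
{%- for message in messages %}
    {%- if message.role == 'system' and loop.first %}
        {#- TypeError here when content is a list of parts, not a string -#}
        {{- '<|system|>\n' + message.content }}
    {%- elif message.role == 'tool' %}
        {#- Lookbehind with no boundary check: for the first message this
            reads messages[-1], i.e. the LAST message, not "no previous". -#}
        {%- if messages[loop.index0 - 1].role != 'tool' %}
            {{- '<tool_results>' }}
        {%- endif %}
        {{- message.content }}
    {%- endif %}
{%- endfor %}
```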
Solution Implemented: The chat_template was refactored with the following key changes (a consolidated sketch of the resulting structure follows the list):
Explicit First Message Handling: Logic was added before the main loop to check if messages[0].role == 'system'.
Separate System Prompt Processing: If the first message is system, its content (including logic for enable_thinking and the tools preamble) is rendered outside the main loop. Checks (is string, is iterable) were added to safely handle potentially list-like messages[0].content.
Conditional Loop Iteration: A loop_messages variable is now used for the main loop ({% for message in loop_messages %}). This variable is set to messages[1:] if the first message was a system prompt (effectively skipping it in the loop), otherwise it's set to the full messages list.
Simplified Main Loop Condition: The if condition within the main loop for handling user or system roles was simplified by removing the and not loop.first check, as the first system message is handled separately.
Robust Tool Response Boundary Checks: The logic within elif message.role == "tool" was revised to compute each message's position in the original messages list (original_index). Explicit boundary checks (original_index > 0, next_original_index < num_original_messages) are now performed before accessing messages[original_index - 1] or messages[next_original_index] to determine whether the adjacent messages are also tool roles, preventing out-of-bounds indexing and type mismatches.
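The sketch below shows the refactored structure implied by points 1–5. It is a simplified illustration under stated assumptions, not the shipped template: the <|...|> markers, <tool_results> tags, and the part.text access for list-style content parts are placeholders for the model's actual special tokens and content schema.

```jinja
{#- 1 & 2: handle a leading system prompt outside the main loop. -#}
{%- if messages[0].role == 'system' %}
    {#- enable_thinking / tools-preamble logic would be rendered here -#}
    {%- if messages[0].content is string %}
        {{- messages[0].content }}
    {%- elif messages[0].content is iterable %}
        {#- e.g. a list of {"type": "text", "text": ...} parts (assumed schema) -#}
        {%- for part in messages[0].content %}
            {{- part.text }}
        {%- endfor %}
    {%- endif %}
    {#- 3: skip the system message in the main loop -#}
    {%- set loop_messages = messages[1:] %}
{%- else %}
    {%- set loop_messages = messages %}
{%- endif %}

{%- set num_original_messages = messages | length %}
{%- for message in loop_messages %}
    {%- if message.role in ['user', 'system'] %}
        {#- 4: no 'and not loop.first' guard needed any more -#}
        {{- '<|' + message.role + '|>\n' + message.content }}
    {%- elif message.role == 'tool' %}
        {#- 5: recover this message's index in the ORIGINAL list -#}
        {%- set offset = num_original_messages - (loop_messages | length) %}
        {%- set original_index = loop.index0 + offset %}
        {%- set next_original_index = original_index + 1 %}
        {#- open the tool-results block only after a non-tool message -#}
        {%- if original_index > 0 and messages[original_index - 1].role != 'tool' %}
            {{- '<tool_results>' }}
        {%- endif %}
        {{- message.content }}
        {#- close it only when the next original message is not a tool -#}
        {%- if next_original_index >= num_original_messages or messages[next_original_index].role != 'tool' %}
            {{- '</tool_results>' }}
        {%- endif %}
    {%- endif %}
{%- endfor %}
```

Because loop_messages is either messages or messages[1:], the offset between loop position and original position is 0 or 1, which is what makes the boundary checks in the tool branch line up with the original list.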
Outcome: With these changes, the modified chat_template correctly processes conversations containing system prompts, including those involving tool interactions, without raising a TypeError in vLLM.