mistralai/Magistral-Small-2506

To reproduce, ask the model a question that involves the <think> token itself. For example: "Write a regex to parse <think></think> tags. Provide an example." The model's reasoning output may then contain literal <think> and </think> tags before the real closing </think> tag, which can break any parser that splits the reasoning from the answer, especially during streaming.
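To make the failure concrete, here is a minimal Python sketch of a parser that treats the first </think> it sees as the end of the reasoning. The function name `split_on_first_close` and the sample output string are made up for illustration; they are not taken from any real client or from actual model output.

```python
def split_on_first_close(text: str, close_tag: str = "</think>") -> tuple[str, str]:
    """Naively treat the first close tag as the end of the reasoning."""
    head, sep, tail = text.partition(close_tag)
    if not sep:
        # No close tag seen yet: everything so far is reasoning.
        return text, ""
    return head.removeprefix("<think>"), tail


# Hypothetical model output: the reasoning itself mentions the tags.
raw = (
    "<think>The user wants a regex for <think></think> tags. "
    "I'll use a non-greedy match.</think>"
    "Use re.compile(r'<think>(.*?)</think>', re.S)."
)

reasoning, answer = split_on_first_close(raw)
print(repr(reasoning))  # cut off right after the literal "<think>" mention
print(repr(answer))     # the rest of the reasoning leaks into the answer
```

The split happens at the tag the model merely quoted, so the remainder of the reasoning (and the real closing tag) ends up in what the client shows as the answer.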

As a solution, the reasoning output should be wrapped in dedicated special <think> and </think> tokens, while any tags mentioned inside the reasoning can still be written with regular text tokens, such as the <th token used for HTML tags. If you have any better ideas, let me know.

Btw, the Qwen3 model has the opposite problem. It has special <think></think> tokens, but it doesn't have regular tokens to show when it's just thinking, like <th.

Here's what Qwen3's special <think> and </think> tokens look like:

{
  "id": 151667,
  "content": "<think>",
  ...
},
{
  "id": 151668,
  "content": "</think>",
  ...
}
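With dedicated special tokens, a client can split on token IDs instead of on text. The sketch below is only an illustration under assumptions: the two IDs are copied from the entries above, while the `(token_id, text)` stream format, the function name `split_by_token_id`, and the small IDs used for ordinary text pieces are invented for the example.

```python
THINK_OPEN_ID = 151667   # "<think>" special token, from the entries above
THINK_CLOSE_ID = 151668  # "</think>" special token, from the entries above


def split_by_token_id(tokens):
    """Split a stream of (token_id, text) pairs into (reasoning, answer).

    Only the special token IDs switch modes; a literal "<think>" spelled
    out with ordinary text tokens is kept as plain content.
    """
    reasoning, answer = [], []
    in_reasoning = False
    for token_id, text in tokens:
        if token_id == THINK_OPEN_ID:
            in_reasoning = True
        elif token_id == THINK_CLOSE_ID:
            in_reasoning = False
        elif in_reasoning:
            reasoning.append(text)
        else:
            answer.append(text)
    return "".join(reasoning), "".join(answer)


# Hypothetical stream: IDs 1-8 are made-up stand-ins for regular text tokens.
stream = [
    (151667, "<think>"),
    (1, "Match "), (2, "<th"), (3, "ink"), (4, ">"),
    (5, "</"), (6, "think"), (4, ">"), (7, " lazily."),
    (151668, "</think>"),
    (8, "Use a non-greedy pattern."),
]

reasoning, answer = split_by_token_id(stream)
print(repr(reasoning))
print(repr(answer))
```

Because the mode only changes on the special IDs, this also works token by token during streaming, without buffering text to look for tag strings.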
