reasoning token

#4 opened by lssj14

What is the reasoning token of the Granite model?
For example, it is set as `<think>` in Qwen and DeepSeek.

IBM Granite org

Hi @lssj14, thanks for your interest in the Granite family. A couple of notes about your question:

  1. This model is an accelerator (speculative decoder) for the Granite Code 20b model, which does not support thinking
  2. None of the models in the Granite Code family (Granite 2.x) support thinking
  3. In the Granite 3.x family, thinking was introduced in 3.2 (e.g. Granite 3.2 2B)
  4. There is no single token in the 3.x family that triggers thinking. Instead, you can enable/disable it via the `thinking=True` flag to the `apply_chat_template` function in transformers (see the sketch after this list). Under the hood, this adds a prewritten section to the user's system prompt, so if you are using the model behind a hosted chat API endpoint that does not support the thinking argument (e.g. an OpenAI-compatible REST API), you can enable thinking by adding that system prompt snippet to your own system prompt.
  • NOTE: The system prompt snippet differs between 3.2 and 3.3, so make sure to check the appropriate chat template to find the right snippet
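A minimal sketch of what point 4 looks like in practice, assuming a Granite 3.2 instruct checkpoint (the model ID `ibm-granite/granite-3.2-2b-instruct` below is illustrative); extra keyword arguments passed to `apply_chat_template` are forwarded to the chat template, which is how `thinking=True` takes effect:

```python
from transformers import AutoTokenizer

# Illustrative checkpoint; any Granite 3.2+ instruct model should behave the same way.
model_id = "ibm-granite/granite-3.2-2b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [{"role": "user", "content": "How many prime numbers are there below 20?"}]

# Render the prompt with and without thinking, without tokenizing,
# so the resulting text can be inspected directly.
with_thinking = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False, thinking=True
)
without_thinking = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False, thinking=False
)

print(with_thinking)
print(without_thinking)
```

Comparing the two rendered prompts is also a quick way to recover the exact system prompt snippet for your Granite version (per the NOTE above), which you can then paste into your own system prompt when calling the model through a hosted API that has no `thinking` argument.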

@gabegoodhart
Thank you for your detailed answers!

lssj14 changed discussion status to closed
