<think> in generation output
I've been testing this model today and am impressed with the results. One issue I'm facing is that the generated output sometimes includes the reasoning steps, even though I specifically tell it not to. Is anyone else seeing the same issue?
Most likely there is an error in your system instruction, or you are enabling thinking (by adding "Enable deep thinking subroutine." to the system prompt). Perhaps take another look?
When you enable thinking, the model follows your instructions only after the <think>...</think> tags. In other words, the model will not follow user instructions inside the <think>...</think> tags, which are reserved for its own reasoning process. The part after the tags is the response intended for the user.
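If the tags still leak into your output, one client-side workaround is to strip the block yourself. A rough Python sketch (it assumes the tags appear verbatim in the generated text and that the closing tag is always emitted, which may not hold for every template):

```python
import re

def strip_think(text: str) -> str:
    """Drop the <think>...</think> block so only the user-facing answer remains."""
    # Assumes the tags appear verbatim and the closing tag is present.
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

raw = "<think>Need to check the prompt format first...</think>Here is the fix: ..."
print(strip_think(raw))  # -> "Here is the fix: ..."
```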
I am using a quantized version with llama.cpp and enabled the thinking subroutine in the system prompt, so that might be it.
I'm using FP16 and have the same issue (temp=0.6, top_p=0.95), even with the system prompt set to exactly "Enable deep thinking subroutine.\n\n" (real newlines, not the literal characters; see the sketch at the end of this post for one way to pass it). It seems to depend on the style of task/question you ask.
I'm guessing that, since it's a fine-tune, the training might need more structure than just that one sentence to differentiate the two modes, something more formal so the model learns the 'switch' better.
Awesome model, great for quick coding with no BS. Nice work!
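For reference, a minimal sketch of passing the switch sentence with real newlines, assuming llama-cpp-python's chat completion API (the GGUF path and user prompt are just placeholders):

```python
from llama_cpp import Llama

# Placeholder GGUF path; swap in whatever file you actually run.
llm = Llama(model_path="model-f16.gguf", n_ctx=4096)

# Two real newline characters, not the literal text "/n/n".
system_prompt = "Enable deep thinking subroutine.\n\n"

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Write a function that reverses a string."},
    ],
    temperature=0.6,
    top_p=0.95,
)
print(out["choices"][0]["message"]["content"])
```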