Struggling with this in ollama
#4, opened by Klopez
I have pulled the model as shown here:
https://docs.unsloth.ai/basics/magistral-how-to-run-and-fine-tune
It seems to run OK, but when I load it in Open WebUI it either keeps thinking and never stops, or it doesn't think at all. I have tried modifying many parameters (forcing temperature, min_p and others), changing the system prompt, and even pulling the raw GGUF and applying a Modelfile to it with the Ollama template added. Now it behaves like a completions model and doesn't think at all.
Here is my Modelfile:
from Magistral-Small-2506-UD-Q6_K_XL.gguf
SET PARAMETER temperature 0.7
SET PARAMETER top_p 0.95
TEMPLATE {{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1}}
{{- if eq .Role "system" }}[SYSTEM_PROMPT]{{ .Content }}[/SYSTEM_PROMPT]
{{- else if eq .Role "user" }}[INST]{{ .Content }}[/INST]
{{- else if eq .Role "assistant" }}
{{- if and $.IsThinkSet (and $last .Thinking) -}}
<think>
{{ .Thinking }}
</think>
{{- end }}
{{- if .Content }}{{ .Content }}
{{- end }}
{{- if not (eq (len (slice $.Messages $i)) 1) }}</s>
{{- end }}
{{- end }}
{{- end }}
SYSTEM """A user will ask you to solve a task. You should first draft your thinking process (inner monologue) until you have derived the final answer. Afterwards, write a self-contained summary of your thoughts (i.e. your summary should be succinct but contain all the critical steps you needed to reach the conclusion). You should use Markdown and Latex to format your response. Write both your thoughts and summary in the same language as the task posed by the user.
Your thinking process must follow the template below:
<think>
Your thoughts or/and draft, like working through an exercise on scratch paper. Be as casual and as long as you want until you are confident to generate a correct answer.
</think>
Here, provide a concise summary that reflects your reasoning and presents a clear final answer to the user.
Problem:"""
In Ollama I get:
The official Ollama version of the model works as expected, but I was looking for a Q6 quant.
Any ideas why this would happen?
Have you tried the Q8 quant to see if the same issue occurs?
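Separately from the quant, the Modelfile syntax itself may be worth checking. As far as I know, the Modelfile instruction is `PARAMETER`, not `SET PARAMETER` (`/set parameter` is what you type at the interactive prompt), and a multi-line `TEMPLATE` has to be wrapped in triple quotes, otherwise only its first line may be read as the template. That could explain the model acting like a plain completions model. Below is a sketch of the same Modelfile with only those two things changed. It assumes the GGUF sits next to the Modelfile and that your Ollama build is recent enough to understand `.IsThinkSet` and `.Thinking`. I have not verified it against this exact quant.

```
# Sketch only: path, parameter values, template and system prompt are copied from the post above.
FROM ./Magistral-Small-2506-UD-Q6_K_XL.gguf

# Modelfile instruction is PARAMETER (no SET prefix).
PARAMETER temperature 0.7
PARAMETER top_p 0.95

# Multi-line values are wrapped in triple quotes.
TEMPLATE """{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1}}
{{- if eq .Role "system" }}[SYSTEM_PROMPT]{{ .Content }}[/SYSTEM_PROMPT]
{{- else if eq .Role "user" }}[INST]{{ .Content }}[/INST]
{{- else if eq .Role "assistant" }}
{{- if and $.IsThinkSet (and $last .Thinking) -}}
<think>
{{ .Thinking }}
</think>
{{- end }}
{{- if .Content }}{{ .Content }}
{{- end }}
{{- if not (eq (len (slice $.Messages $i)) 1) }}</s>
{{- end }}
{{- end }}
{{- end }}"""

SYSTEM """A user will ask you to solve a task. You should first draft your thinking process (inner monologue) until you have derived the final answer. Afterwards, write a self-contained summary of your thoughts (i.e. your summary should be succinct but contain all the critical steps you needed to reach the conclusion). You should use Markdown and Latex to format your response. Write both your thoughts and summary in the same language as the task posed by the user.
Your thinking process must follow the template below:
<think>
Your thoughts or/and draft, like working through an exercise on scratch paper. Be as casual and as long as you want until you are confident to generate a correct answer.
</think>
Here, provide a concise summary that reflects your reasoning and presents a clear final answer to the user.
Problem:"""
```

You would then rebuild with something like `ollama create magistral-q6 -f Modelfile` (the name `magistral-q6` is just a placeholder) and check the result with `ollama show magistral-q6 --modelfile` to confirm the whole template was picked up.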