Failed to infer a tool call example (possible template bug)
#2 opened by Garf
The latest git llama.cpp, started with

/build/bin/llama-server -m ~/llama/Magistral-Small-2506-UD-Q6_K_XL.gguf --host 0.0.0.0 -ngl 99 -fa -ctv q8_0 -ctk q8_0 --temp 0.7 --min-p 0.01 --top-p 0.95 -c 40960 --jinja --top-k -1 --repeat-penalty 1.0

gives a warning about a possible template bug:
common_init_from_params: setting dry_penalty_last_n to ctx_size = 40960
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
Failed to infer a tool call example (possible template bug)
srv init: initializing slots, n_slots = 1
slot init: id 0 | task -1 | new slot n_ctx_slot = 40960
main: model loaded
main: chat template, chat_template: {{- bos_token }}
This is expected; we tested it as well and confirmed the behavior with the Mistral team.
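For reference, the warning appears to be non-fatal: it is emitted while llama.cpp probes the chat template with a synthetic tool call at startup, and the log above shows the model still loads. Tool calling can be exercised directly against the server's OpenAI-compatible endpoint. A minimal sketch, assuming the default port 8080 (no --port is passed above); the get_weather tool and the prompt are made up for illustration:

# Hypothetical smoke test: ask for a tool call via /v1/chat/completions.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is the weather in Paris?"}
    ],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'

If the template handles tool calls end to end, the response should contain a tool_calls entry rather than plain text.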