Failed to infer a tool call example (possible template bug)

#2
by Garf - opened

Latest git llama.cpp:

/build/bin/llama-server -m ~/llama/Magistral-Small-2506-UD-Q6_K_XL.gguf --host 0.0.0.0 -ngl 99 -fa -ctv q8_0 -ctk q8_0 --temp 0.7 --min-p 0.01 --top-p 0.95 -c 40960 --jinja --top-k -1 --repeat-penalty 1.0

gives a warning about a possible template bug:

common_init_from_params: setting dry_penalty_last_n to ctx_size = 40960
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
Failed to infer a tool call example (possible template bug)
srv          init: initializing slots, n_slots = 1
slot         init: id  0 | task -1 | new slot n_ctx_slot = 40960
main: model loaded
main: chat template, chat_template: {{- bos_token }}
Unsloth AI org

This is expected. We tested it as well and confirmed with the Mistral team.
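
If you want to verify that tool calling still works despite the warning, a minimal sketch of a request against llama-server's OpenAI-compatible endpoint is below. It assumes the server is reachable on the default port 8080; get_weather is a hypothetical tool definition used only for illustration.

# hypothetical tool-call request; adjust host/port to match your --host setting
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is the weather in Paris right now?"}
    ],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string", "description": "City name"}
          },
          "required": ["city"]
        }
      }
    }]
  }'

If the template is handled correctly, the response should contain a tool_calls entry invoking get_weather rather than plain text.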
