FIXED: Template bug fixed in llama.cpp
Hello, does this version take this PR into account? The PR fixed a template bug, but the fix requires regenerating the models.
https://github.com/ggml-org/llama.cpp/commit/ced44be34290fab450f8344efa047d8a08e723b4
Thanks
Edit: Disregard my comment, I got confused with Qwen3 lol
Yes it does!! All our uploads work wherever and there shouldn't be any issues whatsoever. Please let us know if there are though!
This is not true. First, the commit in question was merged after you uploaded the models. Second, I checked your model with the llama.cpp server using --verbose and --jinja.
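For reference, the invocation was roughly the following (the model filename here is a placeholder for whichever quant you test, not the exact file I used):

llama-server -m GLM-4-Q4_K_M.gguf --jinja --verbose

With --verbose, the server logs every prompt token it receives, which is what the traces below show.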
slot update_slots: id 0 | task 0 | new prompt, n_ctx_slot = 10240, n_keep = 0, n_prompt_tokens = 5
slot update_slots: id 0 | task 0 | prompt token 0: 151333 '<sop>'
slot update_slots: id 0 | task 0 | prompt token 1: 151336 '<|user|>'
slot update_slots: id 0 | task 0 | prompt token 2: 198 '
'
slot update_slots: id 0 | task 0 | prompt token 3: 6023 'hi'
slot update_slots: id 0 | task 0 | prompt token 4: 151337 '<|assistant|>'
So the BOS token is missing. This should be the correct behavior:
slot update_slots: id 0 | task 0 | prompt token 0: 151331 '[gMASK]'
slot update_slots: id 0 | task 0 | prompt token 1: 151333 '<sop>'
slot update_slots: id 0 | task 0 | prompt token 2: 151336 '<|user|>'
slot update_slots: id 0 | task 0 | prompt token 3: 198 '
'
slot update_slots: id 0 | task 0 | prompt token 4: 14978 'hello'
slot update_slots: id 0 | task 0 | prompt token 5: 151337 '<|assistant|>'
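If anyone wants to check a download themselves, the chat template is stored in the GGUF metadata under the tokenizer.chat_template key, and the gguf_dump.py script in llama.cpp's gguf-py package can print it (the script path and model filename below are assumptions based on the current repo layout, so adjust them to your checkout):

python llama.cpp/gguf-py/scripts/gguf_dump.py --no-tensors GLM-4-Q4_K_M.gguf

A fixed template should render [gMASK]<sop> ahead of the first <|user|> turn, as in the second trace above.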
Apologies, I got confused and thought I was replying about Qwen3. We're going to reupload the models.
@sovetboga @Dampfinchen @MrDevolver @supernovastar @KeyboardMasher
UPDATE: Should now be all fixed! Feel free to download again! Let us know how it goes. :)
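If you already have the old files, the easiest way to pick up the fixed ones is to re-pull from the Hub; a sketch with huggingface-cli, where the repo id and quant filename are placeholders you would swap for the actual model page:

huggingface-cli download <org>/<model>-GGUF <model>-Q4_K_M.gguf

Downloading the same filename again fetches the updated upload rather than reusing the stale local copy.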
@shimmyshimmer lol, is it me, or did they just update the template again... https://github.com/ggml-org/llama.cpp/commit/e0f572c8466e70d35cbd70ee536ad8fc83b2acac
Does this commit require regenerating the GGUF models?