FIXED: Template bug fixed in llama.cpp

#4
by sovetboga - opened

Hello, does this version take this PR into account?
This PR fixed a template bug, but it requires regenerating the GGUFs.
https://github.com/ggml-org/llama.cpp/commit/ced44be34290fab450f8344efa047d8a08e723b4
Thanks

Edit: Disregard my comment; I got confused with Qwen3 lol

Yes it does!! All our uploads work everywhere and there shouldn't be any issues whatsoever. Please let us know if there are, though!

This is not true. First, the commit in question was merged after you uploaded the models.

Then I checked your model using the llama.cpp server with the --verbose and --jinja flags.
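
For reference, the check can be reproduced with something along these lines (the model path is a placeholder; --verbose makes the server print the prompt tokens):

llama-server -m model.gguf --jinja --verbose
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{"messages": [{"role": "user", "content": "hi"}]}'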

slot update_slots: id 0 | task 0 | new prompt, n_ctx_slot = 10240, n_keep = 0, n_prompt_tokens = 5
slot update_slots: id 0 | task 0 | prompt token 0: 151333 '<sop>'
slot update_slots: id 0 | task 0 | prompt token 1: 151336 '<|user|>'
slot update_slots: id 0 | task 0 | prompt token 2: 198 '
'
slot update_slots: id 0 | task 0 | prompt token 3: 6023 'hi'
slot update_slots: id 0 | task 0 | prompt token 4: 151337 '<|assistant|>'

So the BOS token [gMASK] is missing.

This should be the correct behavior:

slot update_slots: id 0 | task 0 | prompt token 0: 151331 '[gMASK]'
slot update_slots: id 0 | task 0 | prompt token 1: 151333 '<sop>'
slot update_slots: id 0 | task 0 | prompt token 2: 151336 '<|user|>'
slot update_slots: id 0 | task 0 | prompt token 3: 198 '
'
slot update_slots: id 0 | task 0 | prompt token 4: 14978 'hello'
slot update_slots: id 0 | task 0 | prompt token 5: 151337 '<|assistant|>'

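You can cross-check this against the original model's tokenizer with transformers; a minimal sketch, assuming the upstream repo id is THUDM/GLM-4-32B-0414:

from transformers import AutoTokenizer

# Load the tokenizer of the original (non-GGUF) model; the repo id is an assumption.
tok = AutoTokenizer.from_pretrained("THUDM/GLM-4-32B-0414")

# Render the prompt the same way the server does with --jinja:
# apply the chat template and append the assistant turn header.
ids = tok.apply_chat_template(
    [{"role": "user", "content": "hello"}],
    add_generation_prompt=True,
)
for i in ids:
    print(i, repr(tok.decode([i])))
# Expected first two tokens: 151331 '[gMASK]' and 151333 '<sop>'.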

Unsloth AI org

Apologies, I got confused with Qwen3 and thought I was replying to that thread. We're going to reupload the models.

@Dampfinchen you don't miss any detail, do you? 😂 Good job watching for bugs! 😉👍

Unsloth AI org

@sovetboga @Dampfinchen @MrDevolver @supernovastar @KeyboardMasher

UPDATE: Should now be all fixed! Feel free to download again! Let us know how it goes. :)

shimmyshimmer changed discussion title from "Template bug fixed in llama.cpp" to "FIXED: Template bug fixed in llama.cpp"

@shimmyshimmer lol, is it just me, or did they update the template again... https://github.com/ggml-org/llama.cpp/commit/e0f572c8466e70d35cbd70ee536ad8fc83b2acac
Does this commit require regenerating the GGUF models?
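
One way to check whether a GGUF you already have carries the new template is to read the embedded tokenizer.chat_template metadata; a rough sketch with the gguf Python package (the file path is a placeholder, and field access details may differ between gguf versions):

from gguf import GGUFReader

# Open the GGUF and look up the embedded chat template metadata key.
reader = GGUFReader("model.gguf")
field = reader.fields.get("tokenizer.chat_template")
if field is None:
    print("no chat template embedded in this file")
else:
    # For string-valued fields, data[0] indexes the part holding the raw bytes.
    raw = field.parts[field.data[0]]
    print(raw.tobytes().decode("utf-8"))

Comparing that string against the template in the llama.cpp commit should tell you whether the file needs regenerating.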
