FIXED: Template bug fixed in llama.cpp
Hello, does this version take this PR into account? The PR fixed a template bug, but the fix requires regenerating the models.
https://github.com/ggml-org/llama.cpp/commit/ced44be34290fab450f8344efa047d8a08e723b4
Thanks
Edit: Disregard my comment, I got confused with Qwen3 lol
Yes it does!! All our uploads work wherever and there shouldn't be any issues whatsoever. Please let us know if there are though!
This is not true. First, the commit in question was merged after you uploaded the models. Second, I checked your model with the llama.cpp server using --verbose and --jinja.
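For reference, the invocation was roughly the following (the model filename here is a placeholder for whichever quant you test, not the exact file I used):

llama-server -m GLM-4-Q4_K_M.gguf --jinja --verbose

With --verbose, the server logs every prompt token it receives, which is what the traces below show.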
slot update_slots: id 0 | task 0 | new prompt, n_ctx_slot = 10240, n_keep = 0, n_prompt_tokens = 5
slot update_slots: id 0 | task 0 | prompt token 0: 151333 '<sop>'
slot update_slots: id 0 | task 0 | prompt token 1: 151336 '<|user|>'
slot update_slots: id 0 | task 0 | prompt token 2: 198 '
'
slot update_slots: id 0 | task 0 | prompt token 3: 6023 'hi'
slot update_slots: id 0 | task 0 | prompt token 4: 151337 '<|assistant|>'
So the BOS token is missing. This should be the correct behavior:
slot update_slots: id 0 | task 0 | prompt token 0: 151331 '[gMASK]'
slot update_slots: id 0 | task 0 | prompt token 1: 151333 '<sop>'
slot update_slots: id 0 | task 0 | prompt token 2: 151336 '<|user|>'
slot update_slots: id 0 | task 0 | prompt token 3: 198 '
'
slot update_slots: id 0 | task 0 | prompt token 4: 14978 'hello'
slot update_slots: id 0 | task 0 | prompt token 5: 151337 '<|assistant|>'
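If anyone wants to check a download themselves, the chat template is stored in the GGUF metadata under the tokenizer.chat_template key, and the gguf_dump.py script in llama.cpp's gguf-py package can print it (the script path and model filename below are assumptions based on the current repo layout, so adjust them to your checkout):

python llama.cpp/gguf-py/scripts/gguf_dump.py --no-tensors GLM-4-Q4_K_M.gguf

A fixed template should render [gMASK]<sop> ahead of the first <|user|> turn, as in the second trace above.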
Apologies, I got confused and thought I was replying about Qwen3. We're going to reupload the models.
@sovetboga @Dampfinchen @MrDevolver @supernovastar @KeyboardMasher
UPDATE: Should now be all fixed! Feel free to download again! Let us know how it goes. :)
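If you already have the old files, the easiest way to pick up the fixed ones is to re-pull from the Hub; a sketch with huggingface-cli, where the repo id and quant filename are placeholders you would swap for the actual model page:

huggingface-cli download <org>/<model>-GGUF <model>-Q4_K_M.gguf

Downloading the same filename again fetches the updated upload rather than reusing the stale local copy.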
@shimmyshimmer lol, is it me, or did they just update the template again... https://github.com/ggml-org/llama.cpp/commit/e0f572c8466e70d35cbd70ee536ad8fc83b2acac
Does this commit require regenerating the GGUF models?