unknown pre-tokenizer type: 'deepseek-r1-qwen'
What version of llama.cpp did you use? I get this error:
llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'deepseek-r1-qwen'
with the latest version:
From https://github.com/ggerganov/llama.cpp
[new tag] b4516 -> b4516
The Llama distills work; only the Qwen distills (from this repo) throw this. I tried it with DeepSeek-R1-Distill-Qwen-14B-Q8_0.gguf.
Make sure you are using the latest version of llama.cpp: https://unsloth.ai/blog/deepseek-r1
There was a bug where Qwen wasn't working, but llama.cpp pushed an update about 3 hours ago.
Yes, that's why I noted the version I used: b4516. It was 6 minutes old when I wrote that. There is an even newer version now (b4518, 1 hour ago), but I doubt a fix pushed 3 hours ago is what I'm missing.
Did you (or could you) try this model, DeepSeek-R1-Distill-Qwen-14B-Q8_0.gguf, and note which version of llama.cpp you use?
Thank you!
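When chasing version mismatches like this, it can help to confirm both the tag of the checked-out source and the build the binary itself reports. A minimal sketch, assuming a local clone and that your llama.cpp build is recent enough to support the `--version` flag (paths are examples):

```shell
# Show the tag of the checked-out llama.cpp source tree:
git -C llama.cpp describe --tags

# Ask the binary itself which build it is; if the two disagree,
# you are likely running a stale executable from an old build:
./llama-server --version
```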
Interesting that you're still having the error - we have tested it ourselves and it runs. Unsure what's wrong on your end :(
Thank you for the answer. I'll try compiling the latest version and try again, since I got the same error with another GGUF (Bartowski's), and you say it works for you, so it must be something else and not the quantized model.
Any idea how to configure this in the Cursor app (Win 11, 64-bit)? Would be nice to have it in there for my agents to reference.
Just an FYI update: newer versions of llama.cpp place the built executables (llama-server, llama-cli, ...) in /llama.cpp/build/bin. That was the issue on my side - I was running a stale binary. I only found out when (after two git pull -> build cycles) I deleted llama.cpp completely, cloned the repo again and did a full build. Then I noticed there is no llama-server in the root folder anymore :)
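For anyone hitting the same thing, a clean-rebuild sketch that avoids the stale-binary trap, using the CMake workflow from the llama.cpp README (the model path is an example, not from this thread):

```shell
# Fresh clone and CMake build; executables land in build/bin, not the repo root.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release -j

# Run the server from the new location, not from any old binary in the root:
./build/bin/llama-server -m /path/to/DeepSeek-R1-Distill-Qwen-14B-Q8_0.gguf
```

After a git pull, rebuild with the same two cmake commands; an old llama-server left in the repo root will keep reproducing vocabulary errors that were already fixed upstream.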