Model can't produce certain character pairs - broken tokenization?
#9 opened by CyberShadowMD
Try this completion:
One!
Two!
Three!
Four!
It should suggest "Five!", but the model simply cannot produce a "!" followed by a newline.
Other character sequences are affected as well, which makes this model unusable for certain programming languages.
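To make concrete what I mean by "broken tokenization", this is roughly the kind of round-trip check I have in mind. It is only a sketch using llama-cpp-python (not the CLI I actually run), and it assumes the GGUF file below sits in the working directory:

```python
# Sketch: check whether tokenize/detokenize round-trips "!\n".
# Assumes llama-cpp-python is installed and the GGUF is in the current
# directory; both are assumptions for illustration only.
from llama_cpp import Llama

llm = Llama(model_path="deepseek-coder-33b-instruct.Q4_K_M.gguf", verbose=False)

text = b"Four!\nFive!\n"
tokens = llm.tokenize(text, add_bos=False)
print(tokens)
print(llm.detokenize(tokens))
# If the vocabulary/merges in the GGUF were converted correctly, the
# detokenized bytes should equal `text` exactly.
```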
Running deepseek-coder-33b-instruct.Q4_K_M.gguf under llama.cpp (I have tried many versions).
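For reference, the completion test above as a small script. Again this is a sketch via llama-cpp-python rather than the llama.cpp binaries I actually use, and temperature 0 is only there to make the output deterministic:

```python
# Sketch of the greedy completion test described above.
from llama_cpp import Llama

llm = Llama(
    model_path="deepseek-coder-33b-instruct.Q4_K_M.gguf",  # same quant as above
    n_ctx=512,
    verbose=False,
)

prompt = "One!\nTwo!\nThree!\nFour!\n"
out = llm(prompt, max_tokens=8, temperature=0.0)
print(repr(out["choices"][0]["text"]))  # I would expect this to start with "Five!\n"
```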