GGUF (Ollama) version results are far from the API version

#20
by papipsycho - opened

Hello guys,

I'm testing the GGUF version with Ollama, but so far the results are far from those of the API version.

I tried tweaking various settings such as temperature and top_p, and I also tried different quantization levels (5 to 8), but I'm not able to match the quality of the API version.

Do you have any idea what I should change to improve the quality?

It's just spewing nonsense after some tokens.

@papipsycho & @egigoka Could you share some prompts? I am implementing this in chatllm.cpp and want to check that I'm doing it correctly.
