Very sensitive to any repetition penalty!
#52
by
jukofyork
- opened
Just in case anybody tries to use the quantized GGUF files with llama.cpp now that the DBRX PR has been merged in today:
Definitely make sure you reduce the repetition penalty from the default!
Even at 1.05 it does all sorts of strange stuff like stopping mid-sentence, etc., and at the common defaults of 1.1 or 1.2 it's hilariously lazy and bad! (See my post at the bottom of the llama.cpp PR.)
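For anyone unsure how to change it: a minimal sketch of a llama.cpp invocation with the repetition penalty disabled via `--repeat-penalty 1.0` (a value of 1.0 means no penalty). The model filename here is just a placeholder; substitute whichever quantized GGUF file you downloaded.

```shell
# Disable the repetition penalty (1.0 = off) instead of the default.
# "dbrx-instruct.Q4_K_M.gguf" is a placeholder filename, not an exact one.
./main \
  -m dbrx-instruct.Q4_K_M.gguf \
  -p "Write a short Python function that reverses a string." \
  --repeat-penalty 1.0
```

If you still want some penalty, try values much closer to 1.0 (e.g. 1.01 to 1.03) rather than the usual 1.1 defaults.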