Very sensitive to any repetition penalty!
#52
by
jukofyork
- opened
Just in case anybody tries to use the quantized GGUF files with llama.cpp now that the DBRX PR has been merged in today:
Definitely make sure you reduce the repetition penalty from the default!
Even at 1.05 it does all sorts of strange stuff like stopping mid-sentence, etc., and at the common defaults of 1.1 or 1.2 it's hilariously lazy and bad! (See my post at the bottom of the llama.cpp PR.)
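For anyone unsure how to change it: a minimal sketch of a llama.cpp invocation with the repetition penalty disabled via `--repeat-penalty 1.0` (a value of 1.0 means no penalty). The model filename here is just a placeholder; substitute whichever quantized GGUF file you downloaded.

```shell
# Disable the repetition penalty (1.0 = off) instead of the default.
# "dbrx-instruct.Q4_K_M.gguf" is a placeholder filename, not an exact one.
./main \
  -m dbrx-instruct.Q4_K_M.gguf \
  -p "Write a short Python function that reverses a string." \
  --repeat-penalty 1.0
```

If you still want some penalty, try values much closer to 1.0 (e.g. 1.01 to 1.03) rather than the usual 1.1 defaults.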