Prompt sensitivity problems

#44
by TheBigBlockPC - opened

I use diffusers for my inference and quantize the model to NF4 using bitsandbytes, but the model seems extremely sensitive to the prompt format: short prompts usually cause it to break.
My current workaround is to use an LLM to enhance the prompts.
Is that sensitivity due to the quantization, or is it present in the fp16 version too?
I can't really test this myself because of my limited RAM and VRAM; I have a 5090 and 64 GB of RAM.
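For context, here is a minimal sketch of the NF4 quantization setup described above, using diffusers' `BitsAndBytesConfig`. The model ID and transformer class are illustrative placeholders, not necessarily the poster's actual setup:

```python
import torch
from diffusers import BitsAndBytesConfig

# NF4 4-bit quantization config (bitsandbytes backend).
# bnb_4bit_compute_dtype keeps matmuls in bf16, which can reduce
# numerical degradation compared to fp16 compute.
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# The transformer/UNet is usually the memory-heavy part; quantize it
# alone and load the rest of the pipeline normally. Model ID below is
# a placeholder:
# transformer = FluxTransformer2DModel.from_pretrained(
#     "black-forest-labs/FLUX.1-dev",
#     subfolder="transformer",
#     quantization_config=nf4_config,
# )
```

A side-by-side comparison (same seed, same prompt) between the NF4 and fp16 checkpoints would isolate whether the short-prompt failures come from quantization or from the base model.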
