inference speed

#7
by nilx21 - opened

I fine-tuned this model on a custom dataset using LoRA, then merged the weights by setting save_merged_lora_model=True.
When I ran inference with the merged fine-tuned model, I noticed that generation was very slow.
Do you have any idea why this might happen?
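
One way to make "slow" concrete is to time the base model and the merged model on the same prompt. This is only a minimal sketch: `generate_fn` is a hypothetical zero-argument callable that wraps whatever generation call you use and returns the generated tokens, and the helper is not tied to any particular library.

```python
import time
from typing import Callable, Sequence


def tokens_per_second(
    generate_fn: Callable[[], Sequence],
    warmup: int = 1,
    runs: int = 3,
) -> float:
    """Return the average number of generated tokens per second.

    generate_fn is a hypothetical wrapper around your own generation
    call; it must return a sequence whose length is the token count.
    """
    # Warm-up calls are executed but not timed, so one-off costs
    # (lazy initialization, caching) do not skew the measurement.
    for _ in range(warmup):
        generate_fn()

    total_tokens = 0
    start = time.perf_counter()
    for _ in range(runs):
        total_tokens += len(generate_fn())
    elapsed = time.perf_counter() - start

    return total_tokens / elapsed
```

Running this once with the base model and once with the merged model, using the same prompt and generation settings, gives two numbers that can be posted alongside the question.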
