Faster inference #1
opened by soujanyaporia
This is very good work, thanks so much!
- Could we have an option to specify how many samples to generate per prompt, instead of always generating a fixed number?
- Could we make inference faster by using a scheduler that needs far fewer than 100 denoising steps, and by enabling flash attention?
Thank you!