Fix the bug in generation config
#11 by SivilTaram - opened
With the default generation configuration, the model does not use the key-value (KV) cache during inference. Without the cache, every decoding step recomputes attention keys and values for the entire sequence so far, which makes text generation very slow.
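For illustration, here is a minimal sketch of the kind of change involved. It assumes the cache is controlled by a `use_cache` field in the generation config JSON; the helper name `enable_kv_cache` and the sample config contents are hypothetical and not taken from this repository.

```python
import json


def enable_kv_cache(config_text: str) -> str:
    """Return the generation config JSON with `use_cache` set to true.

    Assumption: the config is a flat JSON object and the KV cache is
    toggled by a boolean `use_cache` field.
    """
    config = json.loads(config_text)
    config["use_cache"] = True  # reuse cached keys/values across decoding steps
    return json.dumps(config, indent=2)


# Hypothetical config contents, for illustration only.
before = '{"do_sample": false, "max_new_tokens": 256, "use_cache": false}'
after = enable_kv_cache(before)
print(after)
```

If you cannot change the config file, passing `use_cache=True` at generation time may achieve the same effect, depending on how the model's generation code reads its settings.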
Thank you very much for the fix.
RaymondAISG changed pull request status to merged