GPU requirements of this model
What are the minimum GPU requirements to run the model, and can someone recommend a well-performing GPU model? I want a good user experience for end users of a chatbot application. High latency would significantly hurt that experience, which I would like to avoid.
Here we can see some example reranker usage:
The person in this example used an A10 with pretty good results. Is there a cheaper, equally strong alternative to this GPU? I would like similar latency of around 1 second, as shown with the A10.
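To compare candidate GPUs against that roughly 1 second target, I would time the reranker call with something like the sketch below. It is only a rough harness using the Python standard library: `measure_latency` and `fake_rerank` are hypothetical names, and `fake_rerank` is a placeholder that just sleeps, so it has to be swapped for the real reranker call with a realistic query and candidate list.

```python
import statistics
import time


def measure_latency(fn, warmup=3, runs=20):
    """Return (median, p95) wall-clock latency of fn() in seconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed (model loading, caches, etc.)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    samples.sort()
    # Nearest-rank style index for the 95th percentile, clamped to the list.
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return statistics.median(samples), samples[p95_index]


def fake_rerank():
    # Hypothetical stand-in for the real reranker call.
    # Note: GPU work is often asynchronous, so the real call should wait for
    # the GPU to finish before returning (in PyTorch, for example, by calling
    # torch.cuda.synchronize()), otherwise the timing will be too optimistic.
    time.sleep(0.01)


if __name__ == "__main__":
    median, p95 = measure_latency(fake_rerank)
    print(f"median: {median:.3f}s, p95: {p95:.3f}s")
```

Running the same script on each GPU with the same inputs should give numbers that can be compared directly with the A10 result.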
Thanks for any hints and tips.