Multiple ZeroGPU calls in the same code
#155 by hen - opened
I have a Space that uses two models for RAG (an embedding model and an LLM). I wrapped both my retreive() function (which uses the embedding model) and my llm_inference() function (which uses the LLM) in a @spaces.GPU() decorator. Nothing prevents me from doing so, but the Space is pretty slow and seems to cold-start constantly as it switches between the two models. Any explanation of how @spaces.GPU() works when it is invoked multiple times in the same code would be highly appreciated. Thanks!
As I understand the documentation, that's the expected syntax. ZeroGPU is sometimes just a bit slow because it's public and gets a lot of traffic.
Check whether the issue persists regardless of the time of day.
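For reference, here is a minimal sketch of the setup described in the question, alongside one variant that might be worth comparing. It is not a confirmed fix: nothing in this thread establishes that separate decorated calls cause the slowness. The model logic is replaced by placeholders, and the names `gpu`, `_retreive`, `_llm_inference`, `rag_two_calls`, `rag_single_call`, and `DOCS` are hypothetical. The only `spaces` call assumed is the `spaces.GPU` decorator. A local fallback is included so the sketch also runs where the `spaces` package is not installed.

```python
try:
    import spaces  # provided on Hugging Face Spaces
    gpu = spaces.GPU
except ImportError:
    # Hypothetical local fallback: a no-op decorator so the sketch runs anywhere.
    def gpu(func=None, **kwargs):
        if func is None:
            return lambda f: f
        return func

# Placeholder corpus standing in for a real document store.
DOCS = [
    "ZeroGPU allocates a GPU per decorated call.",
    "Gradio builds the UI.",
]

def _retreive(query: str) -> list[str]:
    # Placeholder for the embedding-model search: naive keyword overlap.
    words = set(query.lower().split())
    return [d for d in DOCS if words & set(d.lower().split())]

def _llm_inference(query: str, context: list[str]) -> str:
    # Placeholder for the LLM call.
    return f"Answer to {query!r} using {len(context)} passage(s)"

# Setup from the question: each step is its own GPU-decorated function.
@gpu
def retreive(query: str) -> list[str]:
    return _retreive(query)

@gpu
def llm_inference(query: str, context: list[str]) -> str:
    return _llm_inference(query, context)

def rag_two_calls(query: str) -> str:
    # Two separate decorated calls per request.
    return llm_inference(query, retreive(query))

# Variant to compare against: both steps inside one decorated function.
@gpu
def rag_single_call(query: str) -> str:
    return _llm_inference(query, _retreive(query))

if __name__ == "__main__":
    print(rag_two_calls("gpu"))
    print(rag_single_call("gpu"))
```

Timing both variants at the same time of day would help separate general ZeroGPU traffic from any overhead that comes from making two decorated calls per request.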