Batch inference
#4 opened by Pavelrst
Hey, quick question: is this model supposed to run faster on GPU when batch_size > 1? I've tried running it with batch_size = 2, 4, and 8 and measured the time of .forward, but I always got slightly slower inference. Any idea why? A sketch of my timing setup is below.
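For reference, here is a minimal timing sketch, assuming a PyTorch model on CUDA (the Linear model, sizes, and iteration counts are placeholders, not this model's actual code). Two common pitfalls it tries to avoid: CUDA kernel launches are asynchronous, so you need torch.cuda.synchronize() before reading the clock, and a larger batch is expected to make each individual .forward call slower in absolute terms; the win, if any, shows up in time per sample.

```python
import time
import torch

# Hypothetical stand-in model; substitute your actual checkpoint and inputs.
model = torch.nn.Linear(512, 512).cuda().eval()

for batch_size in (1, 2, 4, 8):
    x = torch.randn(batch_size, 512, device="cuda")

    # Warm-up: the first calls include allocator and kernel-selection overhead.
    with torch.no_grad():
        for _ in range(5):
            model(x)

    torch.cuda.synchronize()  # flush pending async GPU work before timing
    start = time.perf_counter()
    with torch.no_grad():
        for _ in range(50):
            model(x)
    torch.cuda.synchronize()  # wait for the GPU to finish before stopping the clock
    elapsed = time.perf_counter() - start

    per_sample = elapsed / (50 * batch_size)
    print(f"batch_size={batch_size}: {per_sample * 1e6:.1f} us/sample")
```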
I don't know, to be honest. It depends on your GPU VRAM, I guess.