Hey, quick question: is this model supposed to run faster on GPU when `batch_size` > 1? I've tried running it with `batch_size` = 2, 4, and 8 and measured the time of `.forward`, but inference always came out slightly slower. Any idea why?
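For reference, here's roughly the kind of timing harness I mean — a minimal sketch with a toy stand-in for the model (the `benchmark` helper, `make_batch`, and the batch sizes are all just illustrative). It does warm-up iterations before timing and reports per-sample as well as per-batch time; with real CUDA code you'd also need a `torch.cuda.synchronize()` before reading the clock, since kernels launch asynchronously.

```python
import time

def benchmark(model_fn, make_batch, batch_sizes, warmup=3, iters=20):
    """Time model_fn per batch and per sample for each batch size."""
    results = {}
    for bs in batch_sizes:
        batch = make_batch(bs)
        for _ in range(warmup):
            # Warm-up runs: exclude one-time setup costs (allocations,
            # kernel compilation) from the measurement.
            model_fn(batch)
        start = time.perf_counter()
        for _ in range(iters):
            model_fn(batch)
            # With PyTorch on GPU, call torch.cuda.synchronize() here:
            # CUDA kernels launch asynchronously, so without a sync the
            # timer stops before the GPU has actually finished.
        elapsed = (time.perf_counter() - start) / iters
        results[bs] = {"per_batch": elapsed, "per_sample": elapsed / bs}
    return results

# Toy stand-in for a model: per-batch time naturally grows with batch
# size, but per-sample time should fall as fixed overhead is amortized.
toy_model = lambda batch: [x * x for x in batch]
res = benchmark(toy_model, lambda bs: list(range(bs)), [1, 2, 4, 8])
```

Note that per-batch latency usually *does* rise with batch size — the expected win is in throughput, i.e. `per_sample` time going down.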