About inference time
Example query and documents:

```python
import time

query = "slm markdown"
documents = [
    "https://raw.githubusercontent.com/jina-ai/multimodal-reranker-test/main/handelsblatt-preview.png",
    # "https://raw.githubusercontent.com/jina-ai/multimodal-reranker-test/main/paper-11.png"
]

# Construct query-document pairs; `model` is the multimodal reranker,
# loaded beforehand.
image_pairs = [[query, doc] for doc in documents]

start1 = time.time()
scores1 = model.compute_score(image_pairs, max_length=2048, batch_size=1, doc_type="image")
end1 = time.time()

start2 = time.time()
scores2 = model.compute_score(image_pairs, max_length=2048, batch_size=2, doc_type="image")
end2 = time.time()

start3 = time.time()
scores3 = model.compute_score(image_pairs, max_length=2048, batch_size=1, doc_type="image")
end3 = time.time()

print(end1 - start1, end2 - start2, end3 - start3)
print(scores1, scores2, scores3)
```
Why does `batch_size=1` give the fastest inference? I need more throughput; do you have any recommendations? Thanks!
Yes, I would suggest using local images instead. I think the image download time should also be accounted for in your case.
I tested with local images too; the behavior I described is purely about model inference time.
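A minimal benchmarking sketch for isolating inference time, assuming the model accepts local file paths the same way it accepts URLs. The helpers `download_once` and `time_scoring` are hypothetical, not part of the reranker API; `time_scoring` also adds a warm-up call, since one-time setup cost (e.g. lazy CUDA initialization) otherwise inflates the first measurement and may be worth ruling out here:

```python
import statistics
import time
from pathlib import Path
from urllib.request import urlretrieve

def download_once(urls, cache_dir="images"):
    """Fetch each URL to a local file once, outside the timed
    region, so download time never leaks into the measurement."""
    Path(cache_dir).mkdir(exist_ok=True)
    paths = []
    for url in urls:
        dest = Path(cache_dir) / url.rsplit("/", 1)[-1]
        if not dest.exists():
            urlretrieve(url, dest)
        paths.append(str(dest))
    return paths

def time_scoring(score_fn, pairs, repeats=5):
    """Warm up once, then return the median wall-clock time of
    `score_fn(pairs)` over several repeats."""
    score_fn(pairs)  # warm-up run, not timed
    times = []
    for _ in range(repeats):
        t0 = time.time()
        score_fn(pairs)
        times.append(time.time() - t0)
    return statistics.median(times)
```

With the real model, this would be used as `time_scoring(lambda p: model.compute_score(p, max_length=2048, batch_size=1, doc_type="image"), image_pairs)` after building `image_pairs` from the paths returned by `download_once`.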