low max position embeddings?
#10 opened by ctranslate2-4you
Am I correct in understanding that the maximum grid size for position embeddings is 64x64? Why so low? With a patch size of 14x14 pixels, this effectively forces all images with a resolution of 896x896 to be resized. Moreover, even if only one dimension (e.g. the height) exceeds 896 pixels, the other dimension will nevertheless be resized proportionally... I'm confused about how it's supposedly getting such high marks on OCR or image recognition, for example. I've tested it and it's mediocre...
I meant to say "forces all images with a resolution greater than 896x896"
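For anyone following the arithmetic in the question, here is a rough sketch of where the 896 figure comes from and what a proportional downscale would do to a larger image. The patch size (14) and grid limit (64) are the numbers quoted above; the helper `fit_to_grid` and the step that snaps each side down to a multiple of the patch size are assumptions made for illustration, not the model's actual preprocessing code.

```python
PATCH_SIZE = 14   # pixels per patch side, as quoted in the question
MAX_GRID = 64     # maximum position-embedding grid per side, as quoted

# Largest side length the grid can cover: 64 patches * 14 px = 896 px.
MAX_SIDE = PATCH_SIZE * MAX_GRID


def fit_to_grid(width, height, max_side=MAX_SIDE, patch=PATCH_SIZE):
    """Hypothetical helper: shrink an image size so it fits the grid.

    If the longer side exceeds max_side, both sides are scaled by the
    same factor (aspect ratio preserved). Each side is then rounded
    down to a multiple of the patch size (an assumption; real
    preprocessors may pad or round differently).
    """
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic avoids floating-point rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest
    new_w = max(patch, width // patch * patch)
    new_h = max(patch, height // patch * patch)
    return new_w, new_h


if __name__ == "__main__":
    for size in [(1792, 896), (1000, 2000), (800, 600)]:
        w, h = fit_to_grid(*size)
        print(size, "->", (w, h), "grid", (w // PATCH_SIZE, h // PATCH_SIZE))
```

Under these assumptions a 1792x896 image comes out as 896x448, so the side that was already within the limit is halved as well, which is the effect described in the question.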