Papers
arxiv:2203.11008

Transformer-based HTR for Historical Documents

Published on Mar 21, 2022
Authors:
,
,
,

Abstract

We apply the TrOCR framework to real-world, historical manuscripts and show that TrOCR per se is a strong model, ideal for transfer learning. TrOCR has been trained on English only, but it can adapt to other languages that use the Latin alphabet fairly easily and with little training material. We compare TrOCR against a SOTA HTR framework (Transkribus) and show that it can beat such systems. This finding is essential since Transkribus performs best when it has access to baseline information, which is not needed at all to fine-tune TrOCR.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2203.11008 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2203.11008 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2203.11008 in a Space README.md to link it from this page.

Collections including this paper 2