metadata
library_name: PyLaia
license: mit
tags:
- PyLaia
- PyTorch
- atr
- htr
- ocr
- historical
- handwritten
metrics:
- CER
- WER
language:
- fr
- la
- it
- oc
- es
datasets:
- CATMuS/medieval
pipeline_tag: image-to-text
PyLaia - CATMuS/medieval
This model performs Handwritten Text Recognition (HTR) on historical documents written in Latin and Romance languages.
Model description
The model was trained using the PyLaia library on the CATMuS/medieval dataset.
Training images were resized to a fixed height of {dimension} pixels, preserving the original aspect ratio. Vertical lines were discarded.
| set | lines |
|---|---|
| train | 152,816 |
| val | 19,402 |
| test | 22,590 |
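The image preprocessing described in this section (resize to a fixed height, keep the aspect ratio, drop vertical lines) can be illustrated with a minimal sketch. It is not PyLaia's own code: the function names are hypothetical, the target height is a parameter because this card does not state the value, and treating a line as vertical when it is taller than it is wide is an assumed criterion.

```python
from typing import Iterable, List, Tuple


def resize_to_height(width: int, height: int, target_height: int) -> Tuple[int, int]:
    """Return (new_width, new_height) for a fixed target height, keeping the aspect ratio."""
    if width <= 0 or height <= 0 or target_height <= 0:
        raise ValueError("dimensions must be positive")
    scale = target_height / height
    # Round to the nearest pixel and never let the width collapse to zero.
    new_width = max(1, round(width * scale))
    return new_width, target_height


def is_vertical(width: int, height: int) -> bool:
    """Assumed criterion: a line image is vertical when it is taller than it is wide."""
    return height > width


def preprocess_sizes(
    sizes: Iterable[Tuple[int, int]], target_height: int
) -> List[Tuple[int, int]]:
    """Drop vertical lines, then compute the resized dimensions of the remaining ones."""
    return [
        resize_to_height(w, h, target_height)
        for (w, h) in sizes
        if not is_vertical(w, h)
    ]
```

For example, with a hypothetical target height of 100 pixels, `preprocess_sizes([(1000, 200), (50, 400)], 100)` keeps only the first line and scales it to 500 x 100 pixels. The resulting sizes could then be passed to any image library's resize call.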
An external 6-gram character language model can be used to improve recognition. The language model is trained on the text from the CATMuS/medieval training set.
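To make the idea of a character n-gram language model concrete, here is a toy sketch in pure Python. It is only an illustration under stated assumptions: the function names, the padding symbols, and the add-one smoothing are choices made for this example, and it does not reproduce how the language model for this card was built or how it is combined with the recognizer during decoding. Real pipelines usually rely on a dedicated n-gram toolkit.

```python
import math
from collections import Counter

BOS = "\x02"  # start-of-line padding symbol (illustrative choice)
EOS = "\x03"  # end-of-line marker (illustrative choice)


def train_char_ngram(lines, n=6):
    """Count character n-grams and their (n-1)-character contexts."""
    ngrams, contexts, vocab = Counter(), Counter(), set()
    for line in lines:
        text = BOS * (n - 1) + line + EOS
        for i in range(n - 1, len(text)):
            context = text[i - n + 1:i]
            ngrams[context + text[i]] += 1
            contexts[context] += 1
            vocab.add(text[i])
    return ngrams, contexts, vocab


def log_prob(line, model, n=6):
    """Score a line with add-one smoothing over the training vocabulary."""
    ngrams, contexts, vocab = model
    text = BOS * (n - 1) + line + EOS
    total = 0.0
    for i in range(n - 1, len(text)):
        context = text[i - n + 1:i]
        count = ngrams[context + text[i]]
        total += math.log((count + 1) / (contexts[context] + len(vocab)))
    return total
```

A model trained this way on transcriptions assigns higher scores to character sequences that resemble the training text, which is the signal a decoder can use to prefer plausible readings over implausible ones. In practice, `lines` would be the transcriptions of the training set and `n` would be 6.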
Training Plot
How to use?
Please refer to the PyLaia documentation to use this model.
Demo
https://huggingface.co/spaces/johnlockejrr/yolov11_pylaia_catmus