Teklia/pylaia-norhand-v1

starride-teklia committed on Feb 1, 2024

Commit 21ab0a3 (verified) · 1 Parent(s): 0427abc

Update README.md

Files changed (1)

README.md +23 -16

@@ -18,28 +18,26 @@ This model performs Handwritten Text Recognition in Norwegian. It was developed
 ## Model description
-The model has been trained using the PyLaia library on the [NorHand](https://zenodo.org/record/6542056) document images.
 Training images were resized with a fixed height of 128 pixels, keeping the original aspect ratio.
-## Evaluation results
-The model achieves the following results:
-| set   | CER (%)    | WER (%)   |
-| ----- | ---------- | --------- |
-| train | 2.17       | 7.65     |
-| val   | 8.78       | 24.93     |
-| test  | 7.94       | 24.04     |
-Results improve on validation and test sets when PyLaia is combined with a 6-gram language model.
-The language model is trained on [this text corpus](https://www.nb.no/sprakbanken/en/resource-catalogue/oai-nb-no-sbr-73/) published by the National Library of Norway.
-| set   | CER (%)    | WER (%)   |
-| ----- | ---------- | --------- |
-| train | 2.40       | 8.10      |
-| val   | 7.45       | 19.75     |
-| test  | 6.55       | 18.2      |
 ## How to use
@@ -48,6 +46,15 @@ Please refer to the [documentation](https://atr.pages.teklia.com/pylaia/).
 # Cite us!
 ```bibtex
 @inproceedings{10.1007/978-3-031-06555-2_27,
 author = {Maarand, Martin and Beyer, Yngvil and K\r{a}sen, Andre and Fosseide, Knut T. and Kermorvant, Christopher},

 ## Model description
+The model has been trained using the PyLaia library on the [NorHand v1](https://zenodo.org/record/6542056) document images.
 Training images were resized with a fixed height of 128 pixels, keeping the original aspect ratio.
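
The resizing rule stated above (fixed height of 128 pixels, aspect ratio preserved) can be sketched as a small size computation. This is an illustrative helper, not PyLaia's preprocessing code: the function name `target_size` and the rounding choice are assumptions.

```python
def target_size(width: int, height: int, fixed_height: int = 128) -> tuple[int, int]:
    """Return (new_width, new_height) for an image scaled to a fixed height.

    Hypothetical helper: the README only states that the height is fixed at
    128 pixels and the aspect ratio is kept; rounding to the nearest pixel
    is an assumption made here.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # The same scale factor is applied to both axes to keep the aspect ratio.
    scale = fixed_height / height
    # Never let the width collapse to zero for very tall, narrow inputs.
    new_width = max(1, round(width * scale))
    return new_width, fixed_height


if __name__ == "__main__":
    # A 1000x64 line image is upscaled by a factor of 2.
    print(target_size(1000, 64))
```

The resulting pair can be passed to any image library's resize call; which interpolation filter the authors used is not stated in the README.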
+| split | N horizontal lines |
+| ----- | ------: |
+| train | 19,653  |
+| val   |  2,286  |
+| test  |  1,793  |
+An external 6-gram character language model can be used to improve recognition. The language model is trained on the text from the NorHand v1 training set.
+## Evaluation results
+The model achieves the following results:
+| set   | Language model | CER (%)    | WER (%) | N lines   |
+|:------|:---------------|:----------:|:-------:|----------:|
+| test  | no             |  7.94      |   24.04 |     1,793 |
+| test  | yes            |  6.55      |   18.20 |     1,793 |
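
For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how CER and WER are commonly computed: the Levenshtein distance between reference and prediction, divided by the reference length, over characters or whitespace-separated words. It is not the evaluation code used for this model; the function names and the corpus-level aggregation (summing errors over all lines before dividing) are assumptions.

```python
from typing import Callable, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs: list[str], hyps: list[str],
               tokenize: Callable[[str], list[str]]) -> float:
    """Error rate in percent, aggregated over all lines (an assumption)."""
    errors = 0
    total = 0
    for ref, hyp in zip(refs, hyps):
        ref_tokens, hyp_tokens = tokenize(ref), tokenize(hyp)
        errors += edit_distance(ref_tokens, hyp_tokens)
        total += len(ref_tokens)
    if total == 0:
        raise ValueError("references contain no tokens")
    return 100.0 * errors / total


def cer(refs: list[str], hyps: list[str]) -> float:
    """Character error rate: tokens are individual characters."""
    return error_rate(refs, hyps, list)


def wer(refs: list[str], hyps: list[str]) -> float:
    """Word error rate: tokens are whitespace-separated words."""
    return error_rate(refs, hyps, str.split)


if __name__ == "__main__":
    references = ["jeg er her"]
    predictions = ["jeg var her"]
    print(f"CER: {cer(references, predictions):.2f}%")
    print(f"WER: {wer(references, predictions):.2f}%")
```

Because a single wrong character makes the whole word wrong, WER is typically several times higher than CER, as in the table above.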
 ## How to use
 # Cite us!
+```bibtex
+@inproceedings{pylaia-lib,
+    author = "Tarride, Solène and Schneider, Yoann and Generali, Marie and Boillet, Melodie and Abadie, Bastien and Kermorvant, Christopher",
+    title = "Improving Automatic Text Recognition with Language Models in the PyLaia Open-Source Library",
+    booktitle = "Submitted at ICDAR2024",
+    year = "2024"
+}
+```
 ```bibtex
 @inproceedings{10.1007/978-3-031-06555-2_27,
 author = {Maarand, Martin and Beyer, Yngvil and K\r{a}sen, Andre and Fosseide, Knut T. and Kermorvant, Christopher},