---
license: mit
base_model:
- magistermilitum/tridis_HTR
library_name: transformers
language:
- la
---
Base model: magistermilitum/tridis_HTR v1
Train Lines: ???
Eval Lines: ???
Test Lines: ???
Epochs: 14.1667 / 20
Eval CER: 0.0544
Test CER: 0.0622
Test results with CERberus
Metric | Value |
---|---|
Character Error Rate (%) | 6.22 |
Number of Correct Characters | 186998 |
Number of Substitutions | 5425 |
Number of Insertions | 2933 |
Number of Deletions | 3849 |
Total Character Count | 196272 |
Original Lines Count | 2288 |
Discarded Lines Count | 0 |
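
The reported Character Error Rate can be cross-checked from the counts in the table above. The sketch below assumes the common definition CER = (S + I + D) / N, where N is the reference length, and assumes that N equals correct characters plus substitutions plus deletions; whether CERberus computes it exactly this way is not stated here, but the numbers are consistent with it. The function and variable names are illustrative only.

```python
# Counts copied from the CERberus results table.
correct = 186998
substitutions = 5425
insertions = 2933
deletions = 3849


def character_error_rate(subs: int, ins: int, dels: int, ref_len: int) -> float:
    """Return (S + I + D) / N for a reference of N characters."""
    if ref_len == 0:
        raise ValueError("reference length must be positive")
    return (subs + ins + dels) / ref_len


# Assumption: insertions are not part of the reference text, so the
# reference length is correct + substitutions + deletions.
reference_length = correct + substitutions + deletions

cer = character_error_rate(substitutions, insertions, deletions, reference_length)

print(f"reference length: {reference_length}")
print(f"CER: {cer:.4f} ({cer * 100:.2f} %)")
```

Under these assumptions the reference length matches the Total Character Count of 196272, and the result rounds to the Test CER of 0.0622 (6.22 %).
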
Block | Count | Correct | Incorrect | Correct Ratio (%) | Incorrect Ratio (%) |
---|---|---|---|---|---|
Digits | 0 | 0 | 0 | nan | nan |
Lowercase Latin alphabet | 154731 | 147241 | 7490 | 95.16 | 4.84 |
MUFI Glyphs | 0 | 0 | 0 | nan | nan |
Punctuation | 9 | 4 | 5 | 44.44 | 55.56 |
Uppercase Latin alphabet | 6883 | 6450 | 433 | 93.71 | 6.29 |
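
The ratio columns follow directly from the Count and Correct columns. As a minimal sketch (the helper name `block_ratios` is hypothetical and not part of CERberus), the percentages can be recomputed as follows, with `nan` returned for blocks that do not occur in the test set:

```python
import math

# (count, correct) per character block, copied from the table above.
blocks = {
    "Digits": (0, 0),
    "Lowercase Latin alphabet": (154731, 147241),
    "MUFI Glyphs": (0, 0),
    "Punctuation": (9, 4),
    "Uppercase Latin alphabet": (6883, 6450),
}


def block_ratios(count: int, correct: int) -> tuple[float, float]:
    """Return (correct %, incorrect %), or (nan, nan) for an empty block."""
    if count == 0:
        return (math.nan, math.nan)
    correct_pct = 100 * correct / count
    return (round(correct_pct, 2), round(100 - correct_pct, 2))


ratios = {name: block_ratios(*values) for name, values in blocks.items()}

for name, (ok, bad) in ratios.items():
    print(f"{name}: correct {ok} %, incorrect {bad} %")
```

Note that the Punctuation block contains only 9 characters, so its ratios rest on a very small sample.
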
Fine-tuned on an Anglicana dataset of mainly Medieval Latin text sources, with a few Middle English and Anglo-Norman ones. The documents come from the following series of the English legal court rolls:
- the Common Pleas (CP)
- the Justices (JUST)
The model has not been extensively tested.
Errors often occur in punctuation: only 4 of the 9 punctuation characters in the test set (44.44%) were recognized correctly, an error rate of 55.56%, which mostly consists of missed ‧ dots.
Potential biases are still to be identified.