---
license: mit
base_model:
- magistermilitum/tridis_HTR
library_name: transformers
language:
- la
---

Base model: **magistermilitum/tridis_HTR v1**

Train Lines: ???

Eval Lines: ???

Test Lines: ???

Epochs: 14.1667 / 20

Eval CER: 0.0544

Test CER: 0.0622

Test results with CERberus

| Metric                       | Value   |
|------------------------------|---------|
| Character Error Rate (%)     | 6.22    |
| Number of Correct Characters | 186998  |
| Number of Substitutions      | 5425    |
| Number of Insertions         | 2933    |
| Number of Deletions          | 3849    |
| Total Character Count        | 196272  |
| Original Lines Count         | 2288    |
| Discarded Lines Count        | 0       |

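
The character error rate reported above can be reproduced from the raw counts. The sketch below assumes the common definition CER = (S + I + D) / N, where N is the total character count of the reference; whether CERberus computes it in exactly this way is an assumption, and the function name is illustrative only.

```python
def character_error_rate(substitutions: int, insertions: int,
                         deletions: int, total_chars: int) -> float:
    """Return CER as a fraction, assuming CER = (S + I + D) / N."""
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    return (substitutions + insertions + deletions) / total_chars


# Counts copied from the CERberus test results.
cer = character_error_rate(substitutions=5425, insertions=2933,
                           deletions=3849, total_chars=196272)
print(f"CER: {cer:.4f} ({cer * 100:.2f}%)")  # prints "CER: 0.0622 (6.22%)"
```

Under this definition the 6.22 in the table and the Test CER of 0.0622 are the same value, expressed once as a percentage and once as a fraction.
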

| Block                    | Count  | Correct | Incorrect | Correct Ratio (%) | Incorrect Ratio (%) |
|--------------------------|--------|---------|-----------|-------------------|---------------------|
| Digits                   | 0      | 0       | 0         | nan               | nan                 |
| Lowercase Latin alphabet | 154731 | 147241  | 7490      | 95.16             | 4.84                |
| MUFI Glyphs              | 0      | 0       | 0         | nan               | nan                 |
| Punctuation              | 9      | 4       | 5         | 44.44             | 55.56               |
| Uppercase Latin alphabet | 6883   | 6450    | 433       | 93.71             | 6.29                |

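
The per-block ratios follow from the counts in the same way. The helper below is a hypothetical sketch, not CERberus code: it assumes each ratio is a percentage of the block's character count and returns nan for empty blocks, which would explain the nan entries for Digits and MUFI Glyphs.

```python
import math


def block_ratios(count: int, correct: int) -> tuple[float, float]:
    """Return (correct %, incorrect %) for one character block.

    Empty blocks yield (nan, nan) instead of dividing by zero.
    """
    if count == 0:
        return math.nan, math.nan
    correct_pct = 100 * correct / count
    return correct_pct, 100 - correct_pct


# (count, correct) pairs copied from the block table.
blocks = {
    "Digits": (0, 0),
    "Lowercase Latin alphabet": (154731, 147241),
    "MUFI Glyphs": (0, 0),
    "Punctuation": (9, 4),
    "Uppercase Latin alphabet": (6883, 6450),
}

for name, (count, correct) in blocks.items():
    ok, bad = block_ratios(count, correct)
    print(f"{name}: {ok:.2f}% correct, {bad:.2f}% incorrect")
```

With only 9 punctuation characters in the test set, the punctuation ratios rest on a very small sample.
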
Fine-tuned on an Anglicana dataset drawn from the English legal court rolls. The text sources are mainly in Medieval Latin, with a few in Middle English and Anglo-Norman, and contain documents from:

- the Common Pleas (CP)
- the Justices (JUST)

The model has not been extensively tested.

Punctuation is the weakest character class: 5 of the 9 punctuation characters in the test set (55.56%) were misrecognized, mostly missed ‧ dots.

Potential biases are still to be identified.