metadata
license: mit
base_model:
  - magistermilitum/tridis_HTR
library_name: transformers
language:
  - la

Base model: magistermilitum/tridis_HTR v1

Train Lines: ???

Eval Lines: ???

Test Lines: ???

Epochs: 14.1667 / 20

Eval CER: 0.0544

Test CER: 0.0622

Test results with CERberus

| Metric | Value |
|---|---|
| Character Error Rate (%) | 6.22 |
| Number of Correct Characters | 186998 |
| Number of Substitutions | 5425 |
| Number of Insertions | 2933 |
| Number of Deletions | 3849 |
| Total Character Count | 196272 |
| Original Lines Count | 2288 |
| Discarded Lines Count | 0 |

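
The reported rate can be cross-checked from the edit counts in the table. The sketch below assumes the usual definition of character error rate, (substitutions + insertions + deletions) divided by the number of reference characters, with the reference length taken as correct + substitutions + deletions. That CERberus computes it exactly this way is an assumption, although the numbers above are consistent with it.

```python
# Edit counts copied from the CERberus table above.
correct = 186998
substitutions = 5425
insertions = 2933
deletions = 3849

# Assumption: reference length = correct + substitutions + deletions.
reference_chars = correct + substitutions + deletions


def character_error_rate(s, i, d, n):
    """Return (S + I + D) / N as a fraction of the reference length."""
    return (s + i + d) / n


cer = character_error_rate(substitutions, insertions, deletions, reference_chars)

print(reference_chars)        # matches "Total Character Count"
print(f"{cer:.4f}")           # fraction, comparable to "Test CER"
print(f"{cer * 100:.2f} %")   # percentage, comparable to "Character Error Rate"
```

Under these assumptions the fraction agrees with the Test CER of 0.0622 and the percentage with the 6.22 shown in the table.
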
| Block | Count | Correct | Incorrect | Correct Ratio (%) | Incorrect Ratio (%) |
|---|---|---|---|---|---|
| Digits | 0 | 0 | 0 | nan | nan |
| Lowercase Latin alphabet | 154731 | 147241 | 7490 | 95.16 | 4.84 |
| MUFI Glyphs | 0 | 0 | 0 | nan | nan |
| Punctuation | 9 | 4 | 5 | 44.44 | 55.56 |
| Uppercase Latin alphabet | 6883 | 6450 | 433 | 93.71 | 6.29 |
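
The per-block ratios can be reproduced from the counts in the block statistics above. This is a minimal sketch, not CERberus code: the `ratios` helper is a hypothetical name, and returning NaN for blocks with a count of zero is an assumption made to match the `nan` entries in the table.

```python
import math

# (count, correct, incorrect) per character block, copied from the table above.
blocks = {
    "Digits": (0, 0, 0),
    "Lowercase Latin alphabet": (154731, 147241, 7490),
    "MUFI Glyphs": (0, 0, 0),
    "Punctuation": (9, 4, 5),
    "Uppercase Latin alphabet": (6883, 6450, 433),
}


def ratios(count, correct, incorrect):
    """Return (correct %, incorrect %), or NaN for a block with no characters."""
    if count == 0:
        # Assumption: an empty block has no defined ratio, hence NaN.
        return math.nan, math.nan
    return round(100 * correct / count, 2), round(100 * incorrect / count, 2)


for name, (count, correct, incorrect) in blocks.items():
    ok, bad = ratios(count, correct, incorrect)
    print(f"{name}: {ok} % correct, {bad} % incorrect (n={count})")
```

Printing the count next to each ratio is deliberate: the punctuation figures rest on only 9 characters, so they are far less stable than the alphabet figures.
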

Fine-tuned on an Anglicana dataset of mainly Middle Latin text sources, with a few in Middle English and Anglo-Norman, containing documents from:

  • the Common Pleas (CP)
  • the Justices (JUST)

from the English Legal Court Rolls.

The model has not been extensively tested.

Errors often occur in punctuation: only 4 of the 9 punctuation characters in the test set (44.44%) were recognized correctly, an error rate of 55.56%, which mostly consists of missed ‧ dots.

Potential biases are still to be identified.