---
license: mit
language:
- de
metrics:
- cer
library_name: transformers
tags:
- kurrent
- ocr
- htr
- 16th century
- 17th century
- 18th century
- trocr
---
# TrOCR Kurrent-Model 16th to 18th century

Base model: **dh-unibe/trocr-kurrent** 

Epochs: 19.85 / 20  
Eval CER: 0.05673  
Test CER: 0.05416  
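
The card does not state how the CER values above were computed (for example, whether the text was normalized or how lines were aggregated). As a minimal sketch, assuming the standard definition of character error rate (total character-level edit distance divided by the total number of reference characters), the metric could be reproduced like this. The function names `levenshtein` and `cer` are illustrative and not part of any library.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER: summed edit distance over summed reference length.

    Assumption: no normalization (case, whitespace, punctuation) is applied.
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars
```

Under this definition, a CER of about 0.054 corresponds to roughly 5 to 6 character errors per 100 reference characters.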

This model was trained on an extensive training set of roughly 1,579,200 words of German Kurrent script written in the 16th to 18th centuries.
It was evaluated on an evaluation set and a test set that contain the same hands as the training set (automatic split).

The ground truth stems from a variety of archives, projects, and partners and is biased toward Swiss documents.
Sources include, among others, the State Archives of Zürich (Stillstandsprotokolle, Ratsmanuale, Findmittel), the scholarly edition project Königsfelden (Universities of Zürich and Bern: www.koenigsfelden.uzh.ch), and transcriptions from Einsiedeln.
Further contributions come from the university archives of Greifswald: https://rechtsprechung-im-ostseeraum.archiv.uni-greifswald.de/.

The public Transkribus model (based on PyLaia) can be found here: https://readcoop.eu/model/german-kurrent-16th-18th/

Extensive testing of the model has yet to be carried out.
This is only a first attempt, but it may serve as a useful starting point for fine-tuning.
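
For readers who want to try a checkpoint before fine-tuning, the following is a minimal sketch of transcribing a single text-line image with the `transformers` TrOCR classes. It makes several assumptions: the repository id defaults to the base model named in this card and should be replaced with the id of the checkpoint you actually want; the repository is assumed to ship a processor configuration; and the input is assumed to be an image of one already segmented text line, since TrOCR works on single lines. The function name `transcribe_line` is hypothetical.

```python
# Assumption: this is the base model named in the card; substitute the
# repository id of the checkpoint you want to load.
DEFAULT_MODEL_ID = "dh-unibe/trocr-kurrent"


def transcribe_line(image_path: str, model_id: str = DEFAULT_MODEL_ID) -> str:
    """Transcribe one text-line image with a TrOCR checkpoint."""
    # Imported lazily so this module can be imported without loading the
    # heavy dependencies (requires `transformers`, `torch`, and `Pillow`).
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)

    # TrOCR expects RGB input; the processor resizes and normalizes it.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Generate token ids and decode them back to text.
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

A call such as `transcribe_line("line_0001.png")` (the file name is a placeholder) returns the predicted transcription as a string. For more than a handful of lines, load the processor and model once and reuse them rather than reloading on every call.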