iprada committed on
Commit
a14babd
1 Parent(s): 5951564

Update README.md

Files changed (1)
  1. README.md +18 -3
README.md CHANGED
@@ -3,7 +3,22 @@ license: mit
3
  tags:
4
  - flair
5
  - text-generation
6
- widget:
7
- - text: "My name is Julien and I like to"
8
- example_title: "Julien"
9
  ---
3
  tags:
4
  - flair
5
  - text-generation
6
  ---
7
+
8
+ # Turmbücher LM
9
+
10
+ This repository contains the forward and backward language models that were used to train the [Turmbücher NER](https://huggingface.co/dh-unibe/turmbuecher-ner-v1/edit/main/README.md) model.
11
+
12
+ Both models target premodern German and were trained by Ismail Prada Ziegler as part of a Digital Humanities research project at the University of Bern.
13
+
14
+ We recommend combining both models with Flair's stacked embeddings for best results.
15
+
16
+ ## Data Set
17
+
18
+ Main data set: [Berner Turmbücher](https://www.polit-forum-bern.ch/turmbuecher/), early volumes from the 16th century, written in Early New High German, 61k tokens of training data.
19
+
20
+ Secondary data sets:
21
+ - [SSRQ](https://www.ssrq-sds-fds.ch/home/) - Fribourg, 59k tokens.
22
+ - [Chorgerichtsmanuale](https://www.adfontes.uzh.ch/370540/training/deutsche-transkriptionsuebungen/chorgerichtsmanuale-einleitung) (unpublished), 76k tokens.
23
+ - [Königsfelden Charters](https://www.koenigsfelden.uzh.ch/), 623k tokens.
24
+ - Talgerichtsprotokolle (unpublished), 438k tokens.
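
The README's recommendation to use Flair's stacked embeddings could look roughly like the following. This is a minimal sketch, not the authors' training code. The directory `turmbuecher-lm/` and the checkpoint names `forward.pt` and `backward.pt` are assumptions, since the diff does not name the model files; the helper names `build_stacked_embeddings` and `embed` are likewise hypothetical.

```python
from pathlib import Path

# Hypothetical local paths: the actual checkpoint names in the repository may differ.
FORWARD_LM = Path("turmbuecher-lm/forward.pt")
BACKWARD_LM = Path("turmbuecher-lm/backward.pt")


def build_stacked_embeddings(forward_path, backward_path):
    """Combine a forward and a backward Flair language model into one embedding."""
    # Imported lazily so this module can be loaded without Flair installed.
    from flair.embeddings import FlairEmbeddings, StackedEmbeddings

    return StackedEmbeddings([
        FlairEmbeddings(str(forward_path)),
        FlairEmbeddings(str(backward_path)),
    ])


def embed(text, embeddings):
    """Embed one sentence in place and return it."""
    from flair.data import Sentence

    sentence = Sentence(text)
    embeddings.embed(sentence)  # attaches one vector per token
    return sentence
```

With Flair installed and the two checkpoints downloaded, `build_stacked_embeddings(FORWARD_LM, BACKWARD_LM)` returns an embedding object that can be passed to `embed` or to a Flair sequence tagger. Each token then carries the concatenation of the forward and backward model vectors.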