---
license: mit
tags:
- flair
- text-generation
---
# Turmbücher LM
This repository contains the two language models (forward and backward) that were used to train the [Turmbücher NER](https://huggingface.co/dh-unibe/turmbuecher-ner-v1) model.
Both models target premodern German and were trained by Ismail Prada Ziegler as part of a research project at the University of Bern, Digital Humanities.
We recommend combining the forward and backward models with Flair's stacked embeddings for best results.
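
A minimal sketch of how the two models might be stacked, assuming both files have already been downloaded into a local directory. The file names `forward.pt` and `backward.pt` and the helper functions are hypothetical and only serve as an illustration; check the repository's file list for the actual names. `FlairEmbeddings`, `StackedEmbeddings` and `Sentence` are part of Flair.

```python
from pathlib import Path

# Hypothetical file names: check the repository's file list for the real ones.
FORWARD_FILE = "forward.pt"
BACKWARD_FILE = "backward.pt"


def model_paths(model_dir):
    """Return the expected local paths of the two language models."""
    model_dir = Path(model_dir)
    return {
        "forward": model_dir / FORWARD_FILE,
        "backward": model_dir / BACKWARD_FILE,
    }


def build_stacked_embeddings(model_dir):
    """Stack the forward and backward language models into one embedding."""
    # Imported here so that the path helper works without Flair installed.
    from flair.embeddings import FlairEmbeddings, StackedEmbeddings

    paths = model_paths(model_dir)
    return StackedEmbeddings([
        FlairEmbeddings(str(paths["forward"])),
        FlairEmbeddings(str(paths["backward"])),
    ])


def embed(text, stacked):
    """Embed one piece of text and return the resulting Flair Sentence."""
    from flair.data import Sentence

    sentence = Sentence(text)
    stacked.embed(sentence)  # attaches one embedding vector to each token
    return sentence
```

With the files in place, `stacked = build_stacked_embeddings("path/to/models")` followed by `embed(text, stacked)` returns a sentence whose tokens each carry a vector in `token.embedding`. The same `stacked` object can also be passed as the embeddings of a Flair sequence tagger.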
## Data Set
Main data set: [Berner Turmbücher](https://www.polit-forum-bern.ch/turmbuecher/), early volumes from the 16th century, Early New High German, 61k tokens of training data.
Secondary data sets:
- [SSRQ](https://www.ssrq-sds-fds.ch/home/) - Fribourg, 59k tokens.
- [Chorgerichtsmanuale](https://www.adfontes.uzh.ch/370540/training/deutsche-transkriptionsuebungen/chorgerichtsmanuale-einleitung) (unpublished), 76k tokens.
- [Königsfelden Charters](https://www.koenigsfelden.uzh.ch/), 623k tokens.
- Talgerichtsprotokolle (unpublished), 438k tokens.