---
license: mit
tags:
- flair
- text-generation
---

# Turmbücher LM

This repository contains the forward and backward language models that were used to train the [Turmbücher NER](https://huggingface.co/dh-unibe/turmbuecher-ner-v1) model. Both models cover premodern German and were trained by Ismail Prada Ziegler as part of a research project at the University of Bern, Digital Humanities.

We recommend using Flair's stacked embeddings for best effect.

## Data Set

Main data set: [Berner Turmbücher](https://www.polit-forum-bern.ch/turmbuecher/), early volumes from the 16th century, Early New High German, 61k tokens of training data.

Secondary data sets:

- [SSRQ](https://www.ssrq-sds-fds.ch/home/) - Fribourg, 59k tokens.
- [Chorgerichtsmanuale](https://www.adfontes.uzh.ch/370540/training/deutsche-transkriptionsuebungen/chorgerichtsmanuale-einleitung) (unpublished), 76k tokens.
- [Königsfelden Charters](https://www.koenigsfelden.uzh.ch/), 623k tokens.
- Talgerichtsprotokolle (unpublished), 438k tokens.
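
## Usage

The stacked-embeddings recommendation above could look roughly like the sketch below. It assumes that both models have been downloaded into a local directory and that they are named `forward.pt` and `backward.pt`; these file names are hypothetical, so check the repository's file listing for the actual ones. `FlairEmbeddings`, `StackedEmbeddings`, and `Sentence` are part of the Flair library; the helper functions are illustrative only.

```python
from pathlib import Path

# Hypothetical file names (assumption): replace with the names used in this repository.
FORWARD_FILE = "forward.pt"
BACKWARD_FILE = "backward.pt"


def model_paths(model_dir):
    """Return the (forward, backward) model paths inside model_dir."""
    base = Path(model_dir)
    return base / FORWARD_FILE, base / BACKWARD_FILE


def build_stacked_embeddings(model_dir):
    """Combine the forward and backward language models into one embedding."""
    # Imported here so the path helper above works without Flair installed.
    from flair.embeddings import FlairEmbeddings, StackedEmbeddings

    forward_path, backward_path = model_paths(model_dir)
    return StackedEmbeddings(
        embeddings=[
            FlairEmbeddings(str(forward_path)),
            FlairEmbeddings(str(backward_path)),
        ]
    )


def embed_text(model_dir, text):
    """Embed one sentence and return (token text, embedding vector) pairs."""
    from flair.data import Sentence

    stacked = build_stacked_embeddings(model_dir)
    sentence = Sentence(text)
    stacked.embed(sentence)  # attaches one concatenated vector to each token
    return [(token.text, token.embedding) for token in sentence]
```

Calling `embed_text("models", "...")` with the download directory and a sentence of your own returns one vector per token, formed by concatenating the forward and backward embeddings. The same `StackedEmbeddings` object can also be passed to a Flair sequence tagger if you want to train a downstream model.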