---
library_name: transformers
---

# BETA Historical Swedish Donut

This model extends the base training of [naver-clova-ix/donut-base](https://huggingface.co/naver-clova-ix/donut-base) with a "learn to read" training phase focused on historical handwritten Swedish. It was trained to transcribe paragraphs of 1-15 lines of handwritten text, sourced from documents dating from 1600 to 1900. The model needs to be fine-tuned before downstream use.

This model is still under development.
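
As a starting point for fine-tuning or for inspecting raw output, the sketch below shows one way to load a Donut checkpoint with `transformers` and transcribe a single paragraph image. It is a minimal sketch, not a documented recipe for this model: `MODEL_ID` is a hypothetical placeholder for this repository's id, the `task_prompt` that the decoder expects is not stated in this card and has to be checked against the checkpoint's tokenizer, and the `max_length` default is an arbitrary assumption.

```python
# Hypothetical placeholder: replace with this model's actual Hub repository id.
MODEL_ID = "your-org/historical-swedish-donut"


def load_donut(model_id: str = MODEL_ID):
    """Load the processor and the vision encoder-decoder model from the Hub."""
    # Imported here so the module can be imported without the heavy dependency.
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()
    return processor, model


def transcribe(image, processor, model, task_prompt: str, max_length: int = 512) -> str:
    """Transcribe one paragraph image (a PIL image) and return plain text.

    task_prompt is the decoder start prompt; its value for this checkpoint
    is an assumption the caller has to verify. max_length=512 is likewise
    an assumed upper bound on the generated sequence.
    """
    # Resize and normalise the image into the tensor the encoder expects.
    pixel_values = processor(image.convert("RGB"), return_tensors="pt").pixel_values
    # Encode the start prompt without adding extra special tokens.
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids
    outputs = model.generate(
        pixel_values,
        decoder_input_ids=decoder_input_ids,
        max_length=max_length,
        pad_token_id=processor.tokenizer.pad_token_id,
        eos_token_id=processor.tokenizer.eos_token_id,
        use_cache=True,
    )
    # Decode the first (only) sequence and drop special tokens.
    return processor.batch_decode(outputs, skip_special_tokens=True)[0].strip()
```

With a real repository id in place of the placeholder, a call would look like `processor, model = load_donut()` followed by `transcribe(Image.open("paragraph.jpg"), processor, model, task_prompt=...)`, where `Image` comes from Pillow and the image path is illustrative.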

## Known issues

The model has a tendency to produce empty transcriptions of shorter paragraphs (1-5 lines).
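
Until this is resolved, it can help to detect empty outputs explicitly so that the affected paragraphs can be re-run or sent to manual review. The sketch below is one possible guard and not part of the model: the helper names `clean_transcription` and `split_empty` are hypothetical, and it assumes the raw decoded strings may still contain tag-like special tokens such as `<s>`, `</s>` or `<pad>`.

```python
import re

# Matches tag-like special tokens such as <s>, </s> or <pad>.
# Assumption: genuine transcriptions do not contain angle-bracketed words.
_SPECIAL_TOKEN = re.compile(r"<[^<>\s]+>")


def clean_transcription(raw: str) -> str:
    """Remove tag-like special tokens and collapse runs of whitespace."""
    without_tokens = _SPECIAL_TOKEN.sub(" ", raw)
    return " ".join(without_tokens.split())


def split_empty(results: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Separate usable transcriptions from empty ones.

    results maps an image identifier to the raw decoded model output.
    Returns the cleaned non-empty transcriptions and the identifiers
    whose output was empty after cleaning.
    """
    kept: dict[str, str] = {}
    empty: list[str] = []
    for key, raw in results.items():
        text = clean_transcription(raw)
        if text:
            kept[key] = text
        else:
            empty.append(key)
    return kept, empty
```

The identifiers returned in the second element are the paragraphs worth checking first, especially when they are known to contain only a few lines.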

## Training data

The training data was sourced from Riksarkivet's HTR training data (most of which is available on Hugging Face) and the Norhand v3 dataset.