BETA Historical Swedish Donut
This model extends the base training of naver-clova-ix/donut-base with a "learn to read" training phase focused on historical handwritten Swedish. It was trained to transcribe paragraphs of 1-15 lines of handwritten text from documents dating from 1600-1900. The model must be fine-tuned before downstream use.
This model is still under development.
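As a starting point for fine-tuning, a Donut checkpoint can usually be loaded with the `transformers` library. The sketch below is a minimal example and not an official usage snippet for this model: the helper name `load_donut` is hypothetical, and the default `model_id` is the base checkpoint named above, since this model's own repository id is not stated here. Substitute the id of this model when loading it.

```python
def load_donut(model_id: str = "naver-clova-ix/donut-base"):
    """Load a Donut processor and model from the Hugging Face Hub.

    `model_id` defaults to the base checkpoint; pass this model's own
    repository id instead (assumption: it is published in the standard
    Donut / VisionEncoderDecoder format).
    """
    # Imported lazily so that defining this helper requires neither
    # `transformers` nor network access.
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    return processor, model
```

Calling `processor, model = load_donut("<repository id of this model>")` downloads the weights on first use. The returned pair can then be passed to a fine-tuning loop of your choice.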
Known issues
The model has a tendency to produce empty transcriptions of shorter paragraphs (1-5 lines).
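Until this is resolved, it may help to flag empty outputs for manual review instead of silently accepting them. The following is a minimal sketch, not part of the model's tooling: the helper `find_empty_transcriptions`, the paragraph ids, and the sample strings are all made up for illustration.

```python
from typing import Dict, List


def find_empty_transcriptions(transcriptions: Dict[str, str]) -> List[str]:
    """Return the ids of paragraphs whose transcription is empty or whitespace-only."""
    return [pid for pid, text in transcriptions.items() if not text.strip()]


# Hypothetical model outputs keyed by paragraph id (illustrative strings only).
results = {
    "p1": "Anno 1723 den 4 Maij",
    "p2": "",
    "p3": "   \n",
}

empty_ids = find_empty_transcriptions(results)
print(empty_ids)  # ['p2', 'p3']
```

Flagged paragraphs can then be re-run or transcribed by hand, which matters most for short inputs of 1-5 lines.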
Training data
The training data was sourced from Riksarkivet's HTR training data (most of which can be found here on Hugging Face) and the Norhand v3 dataset.