---
widget:
- text: Simon dog i <mask> i går.
license: mit
datasets:
- ChangeIsKey/kubhist2
language:
- sv
library_name: transformers
---

This is a RoBERTa model trained on kubhist2, a corpus of historical Swedish newspapers (https://spraakbanken.gu.se/en/resources/kubhist2, https://spraakbanken.gu.se/blogg/index.php/2019/09/15/the-kubhist-corpus-of-swedish-newspapers/). A Hugging Face version of kubhist2 is available at https://huggingface.co/datasets/ChangeIsKey/kubhist2.

This is a work in progress: the quality of the model, like the quality of the training data, is far from great.

It is shared here with no guarantee whatsoever and will likely change; use at your own risk.

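A minimal sketch of how the model might be queried for masked-token prediction with the `transformers` fill-mask pipeline. The repository id in `MODEL_ID` is a hypothetical placeholder (this card does not state the id), and the helper name `top_predictions` is introduced here for illustration only. The example sentence is the one from the widget above.

```python
# Hypothetical placeholder: replace with this model's actual Hub repository id.
MODEL_ID = "your-namespace/your-kubhist2-roberta"

# Example sentence taken from the widget in this model card.
TEXT = "Simon dog i <mask> i går."


def top_predictions(text, model_id=MODEL_ID, top_k=5):
    """Return (token, score) pairs for the single <mask> position in `text`."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    # Downloads the model and tokenizer from the Hub on first use.
    fill_mask = pipeline("fill-mask", model=model_id)

    # For an input with one mask, the pipeline returns a list of dicts
    # with (among others) the keys "token_str" and "score".
    results = fill_mask(text, top_k=top_k)
    return [(r["token_str"].strip(), r["score"]) for r in results]
```

Calling `top_predictions(TEXT)` with a valid repository id should return the highest-scoring candidates for the masked word. Given the caveats above about model and data quality, the predictions are best treated as exploratory.
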
### Discussion of Biases

The model is trained on historical data, so outdated views may be present in the training data.

### Other Known Limitations

The training data was produced by OCR. The text is therefore imperfect, especially in the earlier decades.

### Contact

Simon Hengchen, [iguanodon.ai](https://iguanodon.ai)