This is the Finnish GPT-3 XL model (https://huggingface.co/TurkuNLP/gpt3-finnish-xl/) fine-tuned for text simplification. Fine-tuning followed the instructions at https://github.com/spyysalo/instruction-finetune. The instruction used for simplification is "Mukauta selkosuomeksi\n\n" (Finnish for "Adapt into Easy Finnish").
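
As a rough illustration of how the instruction might be used at inference time, the sketch below prepends it to the input text and generates a continuation with the `transformers` library. It rests on assumptions not stated in this card: that the prompt is simply the instruction followed directly by the source text (the exact template is defined by the instruction-finetune repository and may differ), that `model_id` is this repository's id as supplied by the caller, and that the `build_prompt` and `simplify` helper names and the generation settings are purely illustrative.

```python
# Instruction string quoted in this model card.
INSTRUCTION = "Mukauta selkosuomeksi\n\n"


def build_prompt(text: str) -> str:
    """Prepend the simplification instruction to the source text.

    Assumption: plain concatenation. The template actually used during
    fine-tuning may add separators or end markers not described here.
    """
    return INSTRUCTION + text


def simplify(text: str, model_id: str, max_new_tokens: int = 128) -> str:
    """Generate a simplified version of `text` (hypothetical helper).

    `model_id` must be supplied by the caller, e.g. this repository's id.
    """
    # Imported lazily so that build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the tokens generated after the prompt.
    prompt_length = inputs["input_ids"].shape[1]
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `simplify("...", model_id="...")` with a Finnish sentence and the id of this repository would download the weights on first use; decoding parameters such as `max_new_tokens` are placeholders and may need tuning.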

Paper: Towards Automatic Finnish Text Simplification (Dmitrieva & Tiedemann, DeTermIt-WS 2024).

The fine-tuning data is available at http://urn.fi/urn:nbn:fi:lb-2024011703. To replicate the results, use the ids of the training, validation, and test sentence pairs in the "splits.zip" archive in this repository. Each id has the form "{regular text id}__{simple text id}__{sentence pair number}".
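
The id format described above can be split into its three fields with a few lines of Python. This is a minimal sketch: the `PairId` type and `parse_pair_id` function are hypothetical names, the example id is made up, and the code assumes that neither text id itself contains a double underscore. The pair number is kept as a string because its exact format is not specified here.

```python
from typing import NamedTuple


class PairId(NamedTuple):
    """Fields of a sentence-pair id (illustrative container)."""
    regular_text_id: str
    simple_text_id: str
    pair_number: str  # kept as a string; numeric format is not assumed


def parse_pair_id(pair_id: str) -> PairId:
    """Split '{regular text id}__{simple text id}__{sentence pair number}'.

    Assumption: the text ids do not themselves contain '__'.
    """
    parts = pair_id.strip().split("__")
    if len(parts) != 3:
        raise ValueError(f"Expected 3 fields separated by '__', got: {pair_id!r}")
    return PairId(*parts)


# Made-up id, for illustration only.
example = parse_pair_id("regA__simB__7")
print(example.regular_text_id, example.simple_text_id, example.pair_number)
```

Grouping parsed ids by `regular_text_id` is one way to check that sentences from the same source text do not end up in different splits.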
