Model Description

An XLM-RoBERTa-Uk model fine-tuned on Ukrainian texts to restore punctuation and letter case.

How to Use

Download the get_predictions.py script from the repository and place it where Python can import it.

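from transformers import AutoTokenizer, AutoModelForTokenClassification
# recover_text comes from get_predictions.py, downloaded from the repository
from get_predictions import recover_text

# Load the tokenizer and the token-classification model from the Hub
tokenizer = AutoTokenizer.from_pretrained('ukr-models/uk-punctcase')
model = AutoModelForTokenClassification.from_pretrained('ukr-models/uk-punctcase')

# Text whose punctuation and case should be recovered
text = "..."
recover_text(text, model, tokenizer)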
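
To try the model on text you already have in punctuated form, one option is to strip the punctuation and case first and then compare the recovered output with the original. The sketch below assumes the model is meant to receive lowercase text without punctuation, which this card does not state explicitly. strip_punct_and_case is a hypothetical helper written for illustration; it is not part of get_predictions.py.

```python
import string

# Characters to strip: ASCII punctuation plus a few marks common in Ukrainian
# typography (guillemets, dashes, ellipsis, curly quotes).
# The apostrophe and hyphen are kept because they occur inside Ukrainian
# words (e.g. "п'ять", "будь-який").
_PUNCT = (
    string.punctuation.replace("'", "").replace("-", "")
    + "\u00ab\u00bb\u2014\u2013\u2026\u201e\u201c\u201d"
)
_TABLE = str.maketrans({ch: " " for ch in _PUNCT})


def strip_punct_and_case(text: str) -> str:
    """Hypothetical helper: lowercase the text and remove punctuation."""
    tokens = text.translate(_TABLE).lower().split()
    # Drop tokens made up only of hyphens or apostrophes (stand-alone dashes or quotes).
    return " ".join(t for t in tokens if t.strip("-'"))


sample = "Привіт, світе! Як справи?"
print(strip_punct_and_case(sample))
```

The resulting string can be used as the input text for recover_text in the snippet above.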

Model size: 109M params (Safetensors; tensor types: I64, F32)