Zerpal Collection: the largest open-source Udmurt monolingual corpora and pre-trained language models
You can use this model directly with a pipeline for masked language modeling:
from transformers import pipeline
unmasker = pipeline('fill-mask', model='udmurtNLP/zerpal-glot500', tokenizer='cis-lmu/glot500-base')
unmasker("Ӟечбур! Мынам нимы <mask>.")
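For a single `<mask>` token, the `fill-mask` pipeline returns a list of dictionaries with the keys `score`, `token`, `token_str` and `sequence`. The sketch below shows one possible way to turn that list into ranked `(token, score)` pairs. The helper name `top_predictions`, the `min_score` threshold and the example data are assumptions made for illustration; the example data is placeholder input, not real output from this model.

```python
from typing import Dict, List, Tuple


def top_predictions(results: List[Dict], min_score: float = 0.0) -> List[Tuple[str, float]]:
    """Return (token, score) pairs from fill-mask output, best first.

    Hypothetical helper: expects the list-of-dicts shape that the
    fill-mask pipeline returns for an input with a single mask token.
    """
    pairs = [
        (r["token_str"].strip(), float(r["score"]))
        for r in results
        if r["score"] >= min_score  # drop low-confidence candidates
    ]
    return sorted(pairs, key=lambda p: p[1], reverse=True)


# Placeholder data in the same shape as pipeline output; NOT real model predictions.
example_results = [
    {"score": 0.10, "token": 2, "token_str": "tokenB", "sequence": "..."},
    {"score": 0.70, "token": 1, "token_str": "tokenA", "sequence": "..."},
    {"score": 0.01, "token": 3, "token_str": "tokenC", "sequence": "..."},
]

print(top_predictions(example_results, min_score=0.05))
```

With the real pipeline, the same helper could be applied to the return value of the `unmasker(...)` call.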
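Here is how to load this model in PyTorch and run a forward pass on a given text (the output holds masked-language-modeling logits; pass output_hidden_states=True to the model call to also get hidden-state features):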
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained('cis-lmu/glot500-base')
model = AutoModelForMaskedLM.from_pretrained("udmurtNLP/zerpal-glot500")
text = "Яратон, яратон, мар меда сыӵе тон?"
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
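One common way to turn per-token hidden states into a single vector per sentence is masked mean pooling. The sketch below illustrates that idea. It is an assumption about how the features might be used, not a method described in this model card, and the function name `mean_pool` is hypothetical. A random tensor stands in for the model's hidden states so the example runs without downloading anything.

```python
import torch


def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sentence, ignoring padding positions.

    last_hidden_state: (batch, seq_len, hidden_size)
    attention_mask:    (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # (batch, seq_len, 1)
    summed = (last_hidden_state * mask).sum(dim=1)                   # sum over real tokens only
    counts = mask.sum(dim=1).clamp(min=1.0)                          # avoid division by zero
    return summed / counts


# Random stand-in for hidden states (2 sentences, 5 tokens, hidden size 8); not model output.
hidden = torch.randn(2, 5, 8)
mask = torch.tensor([[1, 1, 1, 0, 0],
                     [1, 1, 1, 1, 1]])

sentence_vectors = mean_pool(hidden, mask)
print(sentence_vectors.shape)
```

With the model loaded as above, the inputs would come from `output = model(**encoded_input, output_hidden_states=True)`, using `output.hidden_states[-1]` as the hidden states and `encoded_input['attention_mask']` as the mask.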