Model Card
This is a domain-adapted version of dbmdz/bert-base-turkish-cased.
We continued masked-language pre-training on the open-source yeniguno/turkish_agriculture_corpus to bias the model toward Turkish agricultural vocabulary and discourse while retaining its general-language abilities.
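For readers unfamiliar with masked-language pre-training, the following is a minimal sketch of how training inputs are typically corrupted. It assumes the standard BERT recipe (15% of tokens selected; of those, 80% replaced by the mask token, 10% by a random token, 10% left unchanged). This card does not state the masking rate or strategy actually used, so `mask_tokens`, `mlm_probability`, and the fallback vocabulary are illustrative assumptions, not the authors' training code.

```python
import random


def mask_tokens(tokens, mask_token="[MASK]", vocab=None,
                mlm_probability=0.15, seed=None):
    """Corrupt a token list with the standard BERT 80/10/10 rule (assumed here).

    Returns (masked_tokens, labels). labels[i] holds the original token at
    positions the model must predict, and None everywhere else.
    """
    rng = random.Random(seed)
    # Hypothetical fallback: draw random replacements from the input itself.
    vocab = vocab or tokens
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mlm_probability:
            labels.append(tok)  # this position contributes to the loss
            roll = rng.random()
            if roll < 0.8:
                masked.append(mask_token)         # 80%: replace with [MASK]
            elif roll < 0.9:
                masked.append(rng.choice(vocab))  # 10%: replace with a random token
            else:
                masked.append(tok)                # 10%: keep the original token
        else:
            labels.append(None)  # position is ignored by the loss
            masked.append(tok)
    return masked, labels
```

In practice this step is usually delegated to `DataCollatorForLanguageModeling` from `transformers`, which applies the same rule to token IDs rather than strings.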
How to Get Started with the Model
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Load the domain-adapted checkpoint and its tokenizer
model_checkpoint = "yeniguno/bert-turkish-agriculture-mlm"
model = AutoModelForMaskedLM.from_pretrained(model_checkpoint)
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

text = "Sabah kahvaltıda babam, köyde bu hafta [MASK] hazırlığının başlayacağını söyledi."
inputs = tokenizer(text, return_tensors="pt")

# Inference only, so gradients are not needed
with torch.no_grad():
    token_logits = model(**inputs).logits

# Find the position of [MASK] and take the logits at that position
mask_token_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]
mask_token_logits = token_logits[0, mask_token_index, :]

# Pick the [MASK] candidates with the highest logits
top_5_tokens = torch.topk(mask_token_logits, 5, dim=1).indices[0].tolist()
for token in top_5_tokens:
    print(f"'>>> {text.replace(tokenizer.mask_token, tokenizer.decode([token]))}'")
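The snippet above ranks candidates by raw logits, which are unnormalized scores. If probabilities are wanted alongside each candidate, a softmax over the vocabulary dimension converts them. The sketch below shows the idea in plain Python on a list of floats; `top_k_probabilities` is a hypothetical helper introduced for illustration and is not part of `transformers`.

```python
import math


def top_k_probabilities(logits, k=5):
    """Apply softmax to a list of logits and return the k most likely
    (index, probability) pairs, highest probability first."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]
```

With the tensors from the example, the same result can be obtained directly with `torch.softmax(mask_token_logits, dim=-1)` followed by `torch.topk`; the returned indices are token IDs that `tokenizer.decode` turns back into text.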