
Masked word prediction in 103 languages. Give it a sentence in a language other than English and replace one of the words with "[MASK]". It works with English too, of course, but that defeats the point of the demo.
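For reference, the model can be queried through the transformers fill-mask pipeline. A minimal sketch; the German example sentence is illustrative, not taken from the training data:

```python
from transformers import pipeline

# Load the model via the fill-mask pipeline (model id from this card).
unmasker = pipeline(
    "fill-mask",
    model="zzzotop/zero-shot-cross-lingual-transfer-demo-masked",
)

# German: "The weather is very [MASK] today." -> top candidate fills.
for prediction in unmasker("Das Wetter ist heute sehr [MASK]."):
    print(prediction["token_str"], prediction["score"])
```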

distilbert-base-multilingual-cased, finetuned on 50,000 examples from the r/explainlikeimfive subset of the ELI5 dataset for English masked language modelling. All knowledge of the target languages is acquired from pretraining only.

Hyperparameters:
- epochs: 3
- learning rate: 2e-5
- batch size: 8
- weight decay: 0.01
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
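These values correspond to a standard transformers Trainer run. A minimal sketch under that assumption; dataset loading and tokenisation are omitted, and the output_dir name is illustrative:

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer, Trainer, TrainingArguments

# Base checkpoint named on this card.
model = AutoModelForMaskedLM.from_pretrained("distilbert-base-multilingual-cased")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-multilingual-cased")

args = TrainingArguments(
    output_dir="zero-shot-cross-lingual-transfer-demo-masked",  # illustrative
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    weight_decay=0.01,
    # betas=(0.9, 0.999) and epsilon=1e-08 are the transformers Adam defaults.
)

# trainer = Trainer(model=model, args=args, train_dataset=...)  # dataset omitted
# trainer.train()
```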

Final model perplexity: 10.22
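Perplexity here is presumably the exponential of the mean evaluation cross-entropy loss, the usual convention for language-model cards; the loss value below is back-calculated from the reported figure, not taken from a training log:

```python
import math

eval_loss = 2.3243  # illustrative: the loss corresponding to perplexity 10.22
perplexity = math.exp(eval_loss)
print(f"Perplexity: {perplexity:.2f}")  # -> Perplexity: 10.22
```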
