Albanian ALBERT model pretrained on around 16 GB of text (I used the `sq` configuration of uonlp/CulturaX) for 1.1 million training steps, using only the masked language modelling objective. Trained on a TPU v4-32 pod, made possible through the Google TPU Research Cloud.
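As a quick sanity check, the model can be queried with the `fill-mask` pipeline from 🤗 Transformers. The repo id below is a placeholder; substitute this model's actual Hub id:

```python
from transformers import pipeline

# Placeholder repo id -- replace with this model's actual Hub id.
fill_mask = pipeline("fill-mask", model="username/albert-base-albanian")

# ALBERT uses "[MASK]" as its mask token.
# "Tirana është kryeqyteti i Shqipërisë." = "Tirana is the capital of Albania."
for pred in fill_mask("Tirana është [MASK] i Shqipërisë."):
    print(pred["token_str"], pred["score"])
```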
## Hyperparameters
- Optimizer: LAMB
- LR: 0.0006
- : 0.9
- : 0.999
- : 1e-8
- Batch size: 1024
- Num. steps: 1.1 million
- dtype: bfloat16
- max. seq. length: 512
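For reference, here is a minimal sketch of this optimizer configuration using `optax`. The actual pretraining code and learning-rate schedule are not published here, so treat this as an illustration under those assumptions, not the exact setup:

```python
import optax

# Sketch only: the card does not specify a warmup/decay schedule,
# so a constant learning rate is assumed here.
optimizer = optax.lamb(
    learning_rate=6e-4,  # LR: 0.0006
    b1=0.9,              # β₁
    b2=0.999,            # β₂
    eps=1e-8,            # ε
)
```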
I will post the model's performance on various Albanian downstream tasks once evaluation is complete.
## Classification Tasks
| Task | Learning rate | Epochs | Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|---|---|
| AlbMoRe [1] | 1e-05 | 10 | 0.98 | 0.97 | 0.99 | 0.98 |
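A minimal fine-tuning sketch matching the row above. The repo id is a placeholder, the two-example dataset is a toy stand-in for the AlbMoRe reviews, and the batch size is my assumption (it is not reported in the table):

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_id = "username/albert-base-albanian"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# Toy stand-in for the AlbMoRe reviews (1 = positive, 0 = negative);
# load the real corpus in practice.
train_ds = Dataset.from_dict({
    "text": ["Film i mrekullueshëm!", "Film i tmerrshëm."],
    "label": [1, 0],
}).map(lambda batch: tokenizer(batch["text"], truncation=True), batched=True)

args = TrainingArguments(
    output_dir="albmore-sentiment",
    learning_rate=1e-5,   # from the table above
    num_train_epochs=10,  # from the table above
)

# Passing the tokenizer enables dynamic padding via DataCollatorWithPadding.
Trainer(model=model, args=args, train_dataset=train_ds, tokenizer=tokenizer).train()
```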
## Regression Tasks
TODO
## References
[1] Çano, E. (2023). AlbMoRe: A corpus of movie reviews for sentiment analysis in Albanian. arXiv preprint arXiv:2306.08526.