Albanian ALBERT model pretrained on roughly 16 GB of text (I used the `sq` configuration of uonlp/CulturaX) for 1.1 million training steps, using only the masked language modelling objective. Trained on a TPU v4-32 pod, made possible through the Google TPU Research Cloud.
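
A minimal sketch of querying the model for masked-token prediction with the 🤗 Transformers `fill-mask` pipeline (the example sentence is just an illustration):

```python
from transformers import pipeline

# Load the model and tokenizer from the Hugging Face Hub.
fill_mask = pipeline("fill-mask", model="edisnord/albert-base-v2-sq")

# ALBERT tokenizers use "[MASK]" as the mask token.
# "Tirana është kryeqyteti i [MASK]." ~ "Tirana is the capital of [MASK]."
for prediction in fill_mask("Tirana është kryeqyteti i [MASK]."):
    print(prediction["token_str"], prediction["score"])
```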

## Hyperparameters

- Optimizer: LAMB
- Learning rate: 0.0006
- $\beta_1$: 0.9
- $\beta_2$: 0.999
- $\epsilon$: 1e-8
- Batch size: 1024
- Number of steps: 1.1 million
- dtype: bfloat16
- Maximum sequence length: 512
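
For reference, the optimizer configuration above corresponds to something like the following Optax setup (a sketch only; a JAX/Optax training stack is an assumption, and the learning-rate schedule is not documented here):

```python
import optax

# LAMB configured with the pretraining hyperparameters listed above.
# Sketch only: the actual training code is not published in this card.
optimizer = optax.lamb(
    learning_rate=6e-4,  # 0.0006
    b1=0.9,    # beta_1
    b2=0.999,  # beta_2
    eps=1e-8,  # epsilon
)
```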

I will post the model's performance on various Albanian downstream tasks here once evaluation is complete.

## Classification Tasks

| Task | Learning rate | Epochs | Accuracy | Precision | Recall | F1 score |
|------|---------------|--------|----------|-----------|--------|----------|
| AlbMoRe [1] | 1e-05 | 10 | 0.98 | 0.97 | 0.99 | 0.98 |
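
The fine-tuning run in the table could be reproduced roughly as follows with the 🤗 Trainer (a sketch: the dataset preparation and the remaining training arguments are assumptions, not the exact code behind these numbers):

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("edisnord/albert-base-v2-sq")
model = AutoModelForSequenceClassification.from_pretrained(
    "edisnord/albert-base-v2-sq",
    num_labels=2,  # AlbMoRe is binary sentiment (positive/negative)
)

# Learning rate and epoch count from the table above; other settings are defaults.
args = TrainingArguments(
    output_dir="albert-sq-albmore",
    learning_rate=1e-5,
    num_train_epochs=10,
)

# `train_ds` and `eval_ds` stand in for tokenized AlbMoRe splits (not shown here).
trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds)
trainer.train()
```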

## Regression Tasks

TODO

## References

[1] Çano, E. (2023). AlbMoRe: A corpus of movie reviews for sentiment analysis in Albanian. arXiv preprint arXiv:2306.08526.
