Albanian ALBERT model pretrained on roughly 16 GB of text (I used the `sq` configuration of uonlp/CulturaX) for 1.1 million training steps, using only the masked language modelling objective. Trained on a TPU v4-32 pod, made possible through the Google TPU Research Cloud.
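
A minimal sketch of querying the model for masked-token prediction with the 🤗 Transformers `fill-mask` pipeline (the example sentence is just an illustration):

```python
from transformers import pipeline

# Load the model and tokenizer from the Hugging Face Hub.
fill_mask = pipeline("fill-mask", model="edisnord/albert-base-v2-sq")

# ALBERT tokenizers use "[MASK]" as the mask token.
# "Tirana është kryeqyteti i [MASK]." ~ "Tirana is the capital of [MASK]."
for prediction in fill_mask("Tirana është kryeqyteti i [MASK]."):
    print(prediction["token_str"], prediction["score"])
```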

## Hyperparameters

- Optimizer: LAMB
- Learning rate: 0.0006
- $\beta_1$: 0.9
- $\beta_2$: 0.999
- $\epsilon$: 1e-8
- Batch size: 1024
- Number of steps: 1.1 million
- dtype: bfloat16
- Maximum sequence length: 512
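
For reference, the optimizer configuration above corresponds to something like the following Optax setup (a sketch only; a JAX/Optax training stack is an assumption, and the learning-rate schedule is not documented here):

```python
import optax

# LAMB configured with the pretraining hyperparameters listed above.
# Sketch only: the actual training code is not published in this card.
optimizer = optax.lamb(
    learning_rate=6e-4,  # 0.0006
    b1=0.9,    # beta_1
    b2=0.999,  # beta_2
    eps=1e-8,  # epsilon
)
```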

I will post the model's performance on various Albanian downstream tasks here once evaluation is complete.

## Classification Tasks

| Task | Learning rate | Epochs | Accuracy | Precision | Recall | F1 score |
|------|---------------|--------|----------|-----------|--------|----------|
| AlbMoRe [1] | 1e-05 | 10 | 0.98 | 0.97 | 0.99 | 0.98 |
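
The fine-tuning run in the table could be reproduced roughly as follows with the 🤗 Trainer (a sketch: the dataset preparation and the remaining training arguments are assumptions, not the exact code behind these numbers):

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("edisnord/albert-base-v2-sq")
model = AutoModelForSequenceClassification.from_pretrained(
    "edisnord/albert-base-v2-sq",
    num_labels=2,  # AlbMoRe is binary sentiment (positive/negative)
)

# Learning rate and epoch count from the table above; other settings are defaults.
args = TrainingArguments(
    output_dir="albert-sq-albmore",
    learning_rate=1e-5,
    num_train_epochs=10,
)

# `train_ds` and `eval_ds` stand in for tokenized AlbMoRe splits (not shown here).
trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds)
trainer.train()
```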

## Regression Tasks

TODO

## References

[1] Çano, E. (2023). AlbMoRe: A corpus of movie reviews for sentiment analysis in Albanian. arXiv preprint arXiv:2306.08526.
