---
license: bsd
language:
- sk
base_model:
- FacebookAI/roberta-base
---

# Slovak RoBERTa Base

A monolingual Slovak RoBERTa base language model.

The model was trained on a collection of Slovak web pages from various sources.

## Training parameters

Training ran on 4 × A100 40GB GPUs for 14 hours.

- Effective batch size: 192
- Sequence length: 512
- Training steps: 120,000
- Warmup steps: 1,000
- Optimizer: AdamW
- Per-device batch size: 48
- Mixed precision: bf16
- Weight decay: 0.01
- Gradient clipping: 1.0
- Learning rate: 1e-5
- Scheduler: cosine

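The parameters above can be cross-checked with a short sketch. It assumes no gradient accumulation (so the effective batch size is the GPU count times the per-device batch size), and it assumes the cosine scheduler warms up linearly to the peak learning rate and then decays to zero. The card states neither detail, so the constant names and the `learning_rate_at` helper are illustrative rather than the actual training code.

```python
import math

# Values taken from the training parameters listed above.
NUM_GPUS = 4
PER_DEVICE_BATCH_SIZE = 48
SEQUENCE_LENGTH = 512
TRAINING_STEPS = 120_000
WARMUP_STEPS = 1_000
PEAK_LEARNING_RATE = 1e-5

# Assumption: no gradient accumulation, so the effective batch size
# is simply GPUs x per-device batch size.
effective_batch_size = NUM_GPUS * PER_DEVICE_BATCH_SIZE

# Upper bound on token positions processed, assuming every sequence
# is filled to the full sequence length.
max_tokens_seen = effective_batch_size * SEQUENCE_LENGTH * TRAINING_STEPS


def learning_rate_at(step: int) -> float:
    """Hypothetical schedule: linear warmup, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LEARNING_RATE * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TRAINING_STEPS - WARMUP_STEPS)
    return PEAK_LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(f"effective batch size: {effective_batch_size}")
    print(f"max token positions:  {max_tokens_seen:,}")
    for step in (0, 500, 1_000, 60_000, 120_000):
        print(f"step {step:>7}: lr = {learning_rate_at(step):.3e}")
```

Under these assumptions the product of 4 GPUs and a per-device batch of 48 matches the stated effective batch size of 192.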