dhladek committed on
Commit 9d999f1 · verified · 1 Parent(s): 25ba7b8

Update README.md

  1. README.md +15 -0
README.md CHANGED
@@ -12,3 +12,18 @@ A monolingual Slovak language model.
 
  Model was trained on a collection of Slovak web pages from various sources.
 
+ ## Training parameters
+
+ We trained on 4 x A100 40GB GPUs for 14 hours.
+
+ - Effective batch size: 192
+ - Sequence length: 512
+ - Training steps: 120,000
+ - warmup_steps: 1000
+ - optimizer: adamw
+ - Per-device batch size: 48
+ - mixed_precision: bf16
+ - weight decay: 0.01
+ - gradient clipping: 1.0
+ - learning_rate: 1e-5
+ - scheduler: cosine
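
The listed values can be cross-checked with a little arithmetic. The sketch below is illustrative and not taken from the training code. It assumes the effective batch size is the product of GPU count, per-device batch size, and gradient accumulation steps, and that the cosine scheduler uses a linear warmup followed by a decay to zero. The README states neither detail. The names (`learning_rate_at`, the constants) are hypothetical.

```python
import math

# Values taken from the training parameters listed above.
NUM_GPUS = 4
PER_DEVICE_BATCH_SIZE = 48
EFFECTIVE_BATCH_SIZE = 192
SEQUENCE_LENGTH = 512
TRAINING_STEPS = 120_000
WARMUP_STEPS = 1_000
PEAK_LEARNING_RATE = 1e-5

# Assumption: effective batch = GPUs x per-device batch x accumulation steps.
GRAD_ACCUMULATION_STEPS = EFFECTIVE_BATCH_SIZE // (NUM_GPUS * PER_DEVICE_BATCH_SIZE)

# Upper bounds: they assume every sequence is filled to the full 512 tokens.
MAX_TOKENS_PER_STEP = EFFECTIVE_BATCH_SIZE * SEQUENCE_LENGTH
MAX_TOKENS_TOTAL = MAX_TOKENS_PER_STEP * TRAINING_STEPS


def learning_rate_at(step: int) -> float:
    """Assumed schedule: linear warmup to the peak, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LEARNING_RATE * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TRAINING_STEPS - WARMUP_STEPS)
    return PEAK_LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("gradient accumulation steps:", GRAD_ACCUMULATION_STEPS)
    print("max tokens per step:", MAX_TOKENS_PER_STEP)
    print("max tokens total:", MAX_TOKENS_TOTAL)
    for step in (0, 500, 1_000, 60_500, 120_000):
        print(step, learning_rate_at(step))
```

Under these assumptions, 4 GPUs with 48 sequences each already account for the effective batch of 192, so no gradient accumulation is implied. The token totals are ceilings, not measured counts, because shorter or padded sequences would lower them.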