2023-10-16 19:31:53,657 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,658 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 19:31:53,658 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,658 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-16 19:31:53,658 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,659 Train:  1085 sentences
2023-10-16 19:31:53,659         (train_with_dev=False, train_with_test=False)
2023-10-16 19:31:53,659 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,659 Training Params:
2023-10-16 19:31:53,659  - learning_rate: "5e-05"
2023-10-16 19:31:53,659  - mini_batch_size: "4"
2023-10-16 19:31:53,659  - max_epochs: "10"
2023-10-16 19:31:53,659  - shuffle: "True"
2023-10-16 19:31:53,659 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,659 Plugins:
2023-10-16 19:31:53,659  - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 19:31:53,659 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,659 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 19:31:53,659  - metric: "('micro avg', 'f1-score')"
2023-10-16 19:31:53,659 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,659 Computation:
2023-10-16 19:31:53,659  - compute on device: cuda:0
2023-10-16 19:31:53,659  - embedding storage: none
2023-10-16 19:31:53,659 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,659 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-16 19:31:53,659 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,659 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:55,223 epoch 1 - iter 27/272 - loss 2.95328015 - time (sec): 1.56 - samples/sec: 3068.64 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:31:56,836 epoch 1 - iter 54/272 - loss 2.16693759 - time (sec): 3.18 - samples/sec: 3231.27 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:31:58,277 epoch 1 - iter 81/272 - loss 1.65058382 - time (sec): 4.62 - samples/sec: 3248.92 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:31:59,693 epoch 1 - iter 108/272 - loss 1.41300806 - time (sec): 6.03 - samples/sec: 3246.96 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:32:01,227 epoch 1 - iter 135/272 - loss 1.16296640 - time (sec): 7.57 - samples/sec: 3364.72 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:32:02,735 epoch 1 - iter 162/272 - loss 1.02940910 - time (sec): 9.07 - samples/sec: 3352.61 - lr: 0.000030 - momentum: 0.000000
2023-10-16 19:32:04,318 epoch 1 - iter 189/272 - loss 0.91107036 - time (sec): 10.66 - samples/sec: 3348.23 - lr: 0.000035 - momentum: 0.000000
2023-10-16 19:32:05,793 epoch 1 - iter 216/272 - loss 0.83286521 - time (sec): 12.13 - samples/sec: 3351.30 - lr: 0.000040 - momentum: 0.000000
2023-10-16 19:32:07,395 epoch 1 - iter 243/272 - loss 0.76057220 - time (sec): 13.73 - samples/sec: 3355.03 - lr: 0.000044 - momentum: 0.000000
2023-10-16 19:32:08,993 epoch 1 - iter 270/272 - loss 0.69365828 - time (sec): 15.33 - samples/sec: 3371.47 - lr: 0.000049 - momentum: 0.000000
2023-10-16 19:32:09,106 ----------------------------------------------------------------------------------------------------
2023-10-16 19:32:09,106 EPOCH 1 done: loss 0.6924 - lr: 0.000049
2023-10-16 19:32:10,110 DEV : loss 0.19004914164543152 - f1-score (micro avg) 0.5426
2023-10-16 19:32:10,114 saving best model
2023-10-16 19:32:10,471 ----------------------------------------------------------------------------------------------------
2023-10-16 19:32:12,167 epoch 2 - iter 27/272 - loss 0.19414973 - time (sec): 1.69 - samples/sec: 3479.42 - lr: 0.000049 - momentum: 0.000000
2023-10-16 19:32:13,697 epoch 2 - iter 54/272 - loss 0.17106126 - time (sec): 3.22 - samples/sec: 3302.48 - lr: 0.000049 - momentum: 0.000000
2023-10-16 19:32:15,263 epoch 2 - iter 81/272 - loss 0.16459301 - time (sec): 4.79 - samples/sec: 3368.49 - lr: 0.000048 - momentum: 0.000000
2023-10-16 19:32:16,827 epoch 2 - iter 108/272 - loss 0.18139997 - time (sec): 6.35 - samples/sec: 3303.45 - lr: 0.000048 - momentum: 0.000000
2023-10-16 19:32:18,369 epoch 2 - iter 135/272 - loss 0.17164269 - time (sec): 7.90 - samples/sec: 3282.85 - lr: 0.000047 - momentum: 0.000000
2023-10-16 19:32:19,835 epoch 2 - iter 162/272 - loss 0.16439995 - time (sec): 9.36 - samples/sec: 3238.76 - lr: 0.000047 - momentum: 0.000000
2023-10-16 19:32:21,448 epoch 2 - iter 189/272 - loss 0.15959924 - time (sec): 10.98 - samples/sec: 3291.21 - lr: 0.000046 - momentum: 0.000000
2023-10-16 19:32:22,926 epoch 2 - iter 216/272 - loss 0.15587286 - time (sec): 12.45 - samples/sec: 3330.90 - lr: 0.000046 - momentum: 0.000000
2023-10-16 19:32:24,472 epoch 2 - iter 243/272 - loss 0.15185873 - time (sec): 14.00 - samples/sec: 3313.00 - lr: 0.000045 - momentum: 0.000000
2023-10-16 19:32:26,009 epoch 2 - iter 270/272 - loss 0.14816339 - time (sec): 15.54 - samples/sec: 3331.01 - lr: 0.000045 - momentum: 0.000000
2023-10-16 19:32:26,108 ----------------------------------------------------------------------------------------------------
2023-10-16 19:32:26,108 EPOCH 2 done: loss 0.1480 - lr: 0.000045
2023-10-16 19:32:27,518 DEV : loss 0.10820505768060684 - f1-score (micro avg) 0.7463
2023-10-16 19:32:27,522 saving best model
2023-10-16 19:32:27,965 ----------------------------------------------------------------------------------------------------
2023-10-16 19:32:29,500 epoch 3 - iter 27/272 - loss 0.08155325 - time (sec): 1.53 - samples/sec: 3281.37 - lr: 0.000044 - momentum: 0.000000
2023-10-16 19:32:30,876 epoch 3 - iter 54/272 - loss 0.08409408 - time (sec): 2.91 - samples/sec: 3254.26 - lr: 0.000043 - momentum: 0.000000
2023-10-16 19:32:32,578 epoch 3 - iter 81/272 - loss 0.08139739 - time (sec): 4.61 - samples/sec: 3376.79 - lr: 0.000043 - momentum: 0.000000
2023-10-16 19:32:34,076 epoch 3 - iter 108/272 - loss 0.08180166 - time (sec): 6.11 - samples/sec: 3332.59 - lr: 0.000042 - momentum: 0.000000
2023-10-16 19:32:35,533 epoch 3 - iter 135/272 - loss 0.07974796 - time (sec): 7.57 - samples/sec: 3281.81 - lr: 0.000042 - momentum: 0.000000
2023-10-16 19:32:37,161 epoch 3 - iter 162/272 - loss 0.08213381 - time (sec): 9.19 - samples/sec: 3327.65 - lr: 0.000041 - momentum: 0.000000
2023-10-16 19:32:38,724 epoch 3 - iter 189/272 - loss 0.08010112 - time (sec): 10.76 - samples/sec: 3301.80 - lr: 0.000041 - momentum: 0.000000
2023-10-16 19:32:40,302 epoch 3 - iter 216/272 - loss 0.08185408 - time (sec): 12.34 - samples/sec: 3301.59 - lr: 0.000040 - momentum: 0.000000
2023-10-16 19:32:41,794 epoch 3 - iter 243/272 - loss 0.08367674 - time (sec): 13.83 - samples/sec: 3290.53 - lr: 0.000040 - momentum: 0.000000
2023-10-16 19:32:43,430 epoch 3 - iter 270/272 - loss 0.08124787 - time (sec): 15.46 - samples/sec: 3336.37 - lr: 0.000039 - momentum: 0.000000
2023-10-16 19:32:43,548 ----------------------------------------------------------------------------------------------------
2023-10-16 19:32:43,548 EPOCH 3 done: loss 0.0809 - lr: 0.000039
2023-10-16 19:32:44,984 DEV : loss 0.10929891467094421 - f1-score (micro avg) 0.7446
2023-10-16 19:32:44,988 ----------------------------------------------------------------------------------------------------
2023-10-16 19:32:46,576 epoch 4 - iter 27/272 - loss 0.05437417 - time (sec): 1.59 - samples/sec: 3533.96 - lr: 0.000038 - momentum: 0.000000
2023-10-16 19:32:48,099 epoch 4 - iter 54/272 - loss 0.04384694 - time (sec): 3.11 - samples/sec: 3554.17 - lr: 0.000038 - momentum: 0.000000
2023-10-16 19:32:49,691 epoch 4 - iter 81/272 - loss 0.04594154 - time (sec): 4.70 - samples/sec: 3477.23 - lr: 0.000037 - momentum: 0.000000
2023-10-16 19:32:51,240 epoch 4 - iter 108/272 - loss 0.05424351 - time (sec): 6.25 - samples/sec: 3424.77 - lr: 0.000037 - momentum: 0.000000
2023-10-16 19:32:52,810 epoch 4 - iter 135/272 - loss 0.04989679 - time (sec): 7.82 - samples/sec: 3412.90 - lr: 0.000036 - momentum: 0.000000
2023-10-16 19:32:54,386 epoch 4 - iter 162/272 - loss 0.04978778 - time (sec): 9.40 - samples/sec: 3389.52 - lr: 0.000036 - momentum: 0.000000
2023-10-16 19:32:56,065 epoch 4 - iter 189/272 - loss 0.05027613 - time (sec): 11.08 - samples/sec: 3392.18 - lr: 0.000035 - momentum: 0.000000
2023-10-16 19:32:57,704 epoch 4 - iter 216/272 - loss 0.05205747 - time (sec): 12.71 - samples/sec: 3307.15 - lr: 0.000034 - momentum: 0.000000
2023-10-16 19:32:59,252 epoch 4 - iter 243/272 - loss 0.05199972 - time (sec): 14.26 - samples/sec: 3315.73 - lr: 0.000034 - momentum: 0.000000
2023-10-16 19:33:00,770 epoch 4 - iter 270/272 - loss 0.05396688 - time (sec): 15.78 - samples/sec: 3289.80 - lr: 0.000033 - momentum: 0.000000
2023-10-16 19:33:00,848 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:00,849 EPOCH 4 done: loss 0.0539 - lr: 0.000033
2023-10-16 19:33:02,295 DEV : loss 0.12052459269762039 - f1-score (micro avg) 0.8095
2023-10-16 19:33:02,299 saving best model
2023-10-16 19:33:02,747 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:04,363 epoch 5 - iter 27/272 - loss 0.04257767 - time (sec): 1.61 - samples/sec: 3332.16 - lr: 0.000033 - momentum: 0.000000
2023-10-16 19:33:05,927 epoch 5 - iter 54/272 - loss 0.03104477 - time (sec): 3.17 - samples/sec: 3234.01 - lr: 0.000032 - momentum: 0.000000
2023-10-16 19:33:07,592 epoch 5 - iter 81/272 - loss 0.02769835 - time (sec): 4.84 - samples/sec: 3363.99 - lr: 0.000032 - momentum: 0.000000
2023-10-16 19:33:09,122 epoch 5 - iter 108/272 - loss 0.03682368 - time (sec): 6.37 - samples/sec: 3352.56 - lr: 0.000031 - momentum: 0.000000
2023-10-16 19:33:10,626 epoch 5 - iter 135/272 - loss 0.04182425 - time (sec): 7.87 - samples/sec: 3379.02 - lr: 0.000031 - momentum: 0.000000
2023-10-16 19:33:12,140 epoch 5 - iter 162/272 - loss 0.04003463 - time (sec): 9.38 - samples/sec: 3316.19 - lr: 0.000030 - momentum: 0.000000
2023-10-16 19:33:13,737 epoch 5 - iter 189/272 - loss 0.03933014 - time (sec): 10.98 - samples/sec: 3353.80 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:33:15,308 epoch 5 - iter 216/272 - loss 0.03933942 - time (sec): 12.55 - samples/sec: 3338.80 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:33:16,911 epoch 5 - iter 243/272 - loss 0.03741570 - time (sec): 14.16 - samples/sec: 3358.22 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:33:18,380 epoch 5 - iter 270/272 - loss 0.03884246 - time (sec): 15.62 - samples/sec: 3322.35 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:33:18,463 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:18,463 EPOCH 5 done: loss 0.0388 - lr: 0.000028
2023-10-16 19:33:19,898 DEV : loss 0.1621648073196411 - f1-score (micro avg) 0.7891
2023-10-16 19:33:19,901 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:21,401 epoch 6 - iter 27/272 - loss 0.01912861 - time (sec): 1.50 - samples/sec: 3262.46 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:33:22,800 epoch 6 - iter 54/272 - loss 0.02221352 - time (sec): 2.90 - samples/sec: 3284.19 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:33:24,348 epoch 6 - iter 81/272 - loss 0.02732011 - time (sec): 4.45 - samples/sec: 3290.85 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:33:26,088 epoch 6 - iter 108/272 - loss 0.02665238 - time (sec): 6.19 - samples/sec: 3331.01 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:33:27,525 epoch 6 - iter 135/272 - loss 0.03098490 - time (sec): 7.62 - samples/sec: 3374.30 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:33:29,079 epoch 6 - iter 162/272 - loss 0.03126089 - time (sec): 9.18 - samples/sec: 3286.18 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:33:30,706 epoch 6 - iter 189/272 - loss 0.03235828 - time (sec): 10.80 - samples/sec: 3258.39 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:33:32,321 epoch 6 - iter 216/272 - loss 0.03110078 - time (sec): 12.42 - samples/sec: 3283.92 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:33:33,890 epoch 6 - iter 243/272 - loss 0.02991518 - time (sec): 13.99 - samples/sec: 3270.91 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:33:35,585 epoch 6 - iter 270/272 - loss 0.02933351 - time (sec): 15.68 - samples/sec: 3309.80 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:33:35,675 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:35,675 EPOCH 6 done: loss 0.0293 - lr: 0.000022
2023-10-16 19:33:37,090 DEV : loss 0.14851412177085876 - f1-score (micro avg) 0.8433
2023-10-16 19:33:37,094 saving best model
2023-10-16 19:33:37,537 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:39,088 epoch 7 - iter 27/272 - loss 0.03104110 - time (sec): 1.55 - samples/sec: 3804.25 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:33:40,630 epoch 7 - iter 54/272 - loss 0.02299984 - time (sec): 3.09 - samples/sec: 3494.92 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:33:42,130 epoch 7 - iter 81/272 - loss 0.02190619 - time (sec): 4.59 - samples/sec: 3523.84 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:33:43,696 epoch 7 - iter 108/272 - loss 0.02242868 - time (sec): 6.16 - samples/sec: 3465.79 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:33:45,260 epoch 7 - iter 135/272 - loss 0.02177196 - time (sec): 7.72 - samples/sec: 3405.58 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:33:46,851 epoch 7 - iter 162/272 - loss 0.02050069 - time (sec): 9.31 - samples/sec: 3408.90 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:33:48,584 epoch 7 - iter 189/272 - loss 0.01812006 - time (sec): 11.04 - samples/sec: 3382.28 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:33:50,044 epoch 7 - iter 216/272 - loss 0.01834730 - time (sec): 12.50 - samples/sec: 3345.05 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:33:51,565 epoch 7 - iter 243/272 - loss 0.02112046 - time (sec): 14.02 - samples/sec: 3358.62 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:33:53,115 epoch 7 - iter 270/272 - loss 0.02135027 - time (sec): 15.57 - samples/sec: 3319.75 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:33:53,200 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:53,200 EPOCH 7 done: loss 0.0212 - lr: 0.000017
2023-10-16 19:33:54,810 DEV : loss 0.16008441150188446 - f1-score (micro avg) 0.8207
2023-10-16 19:33:54,814 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:56,499 epoch 8 - iter 27/272 - loss 0.01358068 - time (sec): 1.68 - samples/sec: 3467.90 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:33:58,092 epoch 8 - iter 54/272 - loss 0.01355086 - time (sec): 3.28 - samples/sec: 3411.86 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:33:59,532 epoch 8 - iter 81/272 - loss 0.01197700 - time (sec): 4.72 - samples/sec: 3422.20 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:34:01,011 epoch 8 - iter 108/272 - loss 0.01122890 - time (sec): 6.20 - samples/sec: 3386.16 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:34:02,554 epoch 8 - iter 135/272 - loss 0.01385527 - time (sec): 7.74 - samples/sec: 3427.09 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:34:04,081 epoch 8 - iter 162/272 - loss 0.01323604 - time (sec): 9.27 - samples/sec: 3411.87 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:34:05,881 epoch 8 - iter 189/272 - loss 0.01363279 - time (sec): 11.07 - samples/sec: 3430.21 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:34:07,477 epoch 8 - iter 216/272 - loss 0.01337961 - time (sec): 12.66 - samples/sec: 3382.58 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:34:08,939 epoch 8 - iter 243/272 - loss 0.01371133 - time (sec): 14.12 - samples/sec: 3351.16 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:34:10,433 epoch 8 - iter 270/272 - loss 0.01409468 - time (sec): 15.62 - samples/sec: 3321.67 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:34:10,520 ----------------------------------------------------------------------------------------------------
2023-10-16 19:34:10,521 EPOCH 8 done: loss 0.0141 - lr: 0.000011
2023-10-16 19:34:11,958 DEV : loss 0.17060722410678864 - f1-score (micro avg) 0.8088
2023-10-16 19:34:11,962 ----------------------------------------------------------------------------------------------------
2023-10-16 19:34:13,452 epoch 9 - iter 27/272 - loss 0.02253323 - time (sec): 1.49 - samples/sec: 3402.69 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:34:15,039 epoch 9 - iter 54/272 - loss 0.02096247 - time (sec): 3.08 - samples/sec: 3492.39 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:34:16,608 epoch 9 - iter 81/272 - loss 0.01767350 - time (sec): 4.65 - samples/sec: 3550.63 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:34:18,108 epoch 9 - iter 108/272 - loss 0.01812866 - time (sec): 6.15 - samples/sec: 3553.70 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:34:19,612 epoch 9 - iter 135/272 - loss 0.01657397 - time (sec): 7.65 - samples/sec: 3446.40 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:34:21,155 epoch 9 - iter 162/272 - loss 0.01586682 - time (sec): 9.19 - samples/sec: 3350.79 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:34:22,715 epoch 9 - iter 189/272 - loss 0.01440847 - time (sec): 10.75 - samples/sec: 3383.69 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:34:24,274 epoch 9 - iter 216/272 - loss 0.01282252 - time (sec): 12.31 - samples/sec: 3381.02 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:34:25,956 epoch 9 - iter 243/272 - loss 0.01219213 - time (sec): 13.99 - samples/sec: 3364.21 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:34:27,400 epoch 9 - iter 270/272 - loss 0.01184073 - time (sec): 15.44 - samples/sec: 3354.34 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:34:27,499 ----------------------------------------------------------------------------------------------------
2023-10-16 19:34:27,499 EPOCH 9 done: loss 0.0118 - lr: 0.000006
2023-10-16 19:34:28,911 DEV : loss 0.17339861392974854 - f1-score (micro avg) 0.7993
2023-10-16 19:34:28,915 ----------------------------------------------------------------------------------------------------
2023-10-16 19:34:30,641 epoch 10 - iter 27/272 - loss 0.00454535 - time (sec): 1.72 - samples/sec: 3068.80 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:34:32,245 epoch 10 - iter 54/272 - loss 0.00270348 - time (sec): 3.33 - samples/sec: 3229.23 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:34:33,797 epoch 10 - iter 81/272 - loss 0.00830662 - time (sec): 4.88 - samples/sec: 3230.52 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:34:35,204 epoch 10 - iter 108/272 - loss 0.00919675 - time (sec): 6.29 - samples/sec: 3246.34 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:34:36,732 epoch 10 - iter 135/272 - loss 0.00958562 - time (sec): 7.82 - samples/sec: 3213.56 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:34:38,281 epoch 10 - iter 162/272 - loss 0.00906453 - time (sec): 9.37 - samples/sec: 3227.96 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:34:39,835 epoch 10 - iter 189/272 - loss 0.00785760 - time (sec): 10.92 - samples/sec: 3247.03 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:34:41,312 epoch 10 - iter 216/272 - loss 0.00743252 - time (sec): 12.40 - samples/sec: 3274.69 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:34:42,878 epoch 10 - iter 243/272 - loss 0.00781445 - time (sec): 13.96 - samples/sec: 3353.35 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:34:44,315 epoch 10 - iter 270/272 - loss 0.00738827 - time (sec): 15.40 - samples/sec: 3365.36 - lr: 0.000000 - momentum: 0.000000
2023-10-16 19:34:44,394 ----------------------------------------------------------------------------------------------------
2023-10-16 19:34:44,394 EPOCH 10 done: loss 0.0074 - lr: 0.000000
2023-10-16 19:34:45,829 DEV : loss 0.17555038630962372 - f1-score (micro avg) 0.811
2023-10-16 19:34:46,196 ----------------------------------------------------------------------------------------------------
2023-10-16 19:34:46,197 Loading model from best epoch ...
2023-10-16 19:34:47,567 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-16 19:34:49,746 Results:
- F-score (micro) 0.7744
- F-score (macro) 0.7533
- Accuracy 0.6521

By class:
              precision    recall  f1-score   support

         LOC     0.7982    0.8494    0.8230       312
         PER     0.6679    0.8606    0.7521       208
         ORG     0.5417    0.4727    0.5049        55
   HumanProd     0.9130    0.9545    0.9333        22

   micro avg     0.7317    0.8224    0.7744       597
   macro avg     0.7302    0.7843    0.7533       597
weighted avg     0.7334    0.8224    0.7730       597

2023-10-16 19:34:49,746 ----------------------------------------------------------------------------------------------------
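The per-iteration lines above follow a fixed field order (epoch, iter, loss, time, samples/sec, lr, momentum), so learning curves can be recovered from the raw log with a small parser. A minimal sketch in plain Python; the regex assumes the exact field order shown in this log, and `parse_iteration` is a hypothetical helper name, not part of Flair:

```python
import re

# Matches iteration lines such as:
# "2023-10-16 19:31:55,223 epoch 1 - iter 27/272 - loss 2.95328015 - time (sec): 1.56
#  - samples/sec: 3068.64 - lr: 0.000005 - momentum: 0.000000"
ITER_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+)"
    r" - loss (?P<loss>[\d.]+)"
    r".*? - lr: (?P<lr>[\d.]+)"
)

def parse_iteration(line: str):
    """Return (epoch, iteration, loss, lr) for an iteration line, or None for other lines."""
    m = ITER_RE.search(line)
    if m is None:
        return None  # separator, DEV, or EPOCH-done lines do not match
    return (int(m["epoch"]), int(m["iter"]), float(m["loss"]), float(m["lr"]))

line = ("2023-10-16 19:31:55,223 epoch 1 - iter 27/272 - loss 2.95328015 "
        "- time (sec): 1.56 - samples/sec: 3068.64 - lr: 0.000005 - momentum: 0.000000")
print(parse_iteration(line))  # (1, 27, 2.95328015, 5e-06)
```

Feeding every line of the log through `parse_iteration` and keeping the non-`None` results gives the loss and linearly decaying learning-rate schedule visible above.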