2023-10-18 16:06:26,406 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:26,407 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 16:06:26,407 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:26,407 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-18 16:06:26,407 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:26,407 Train: 1214 sentences
2023-10-18 16:06:26,407 (train_with_dev=False, train_with_test=False)
2023-10-18 16:06:26,407 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:26,407 Training Params:
2023-10-18 16:06:26,407 - learning_rate: "5e-05"
2023-10-18 16:06:26,407 - mini_batch_size: "4"
2023-10-18 16:06:26,407 - max_epochs: "10"
2023-10-18 16:06:26,407 - shuffle: "True"
2023-10-18 16:06:26,407 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:26,407 Plugins:
2023-10-18 16:06:26,407 - TensorboardLogger
2023-10-18 16:06:26,407 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:06:26,407 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:26,407 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:06:26,407 - metric: "('micro avg', 'f1-score')"
2023-10-18 16:06:26,407 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:26,407 Computation:
2023-10-18 16:06:26,407 - compute on device: cuda:0
2023-10-18 16:06:26,407 - embedding storage: none
2023-10-18 16:06:26,407 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:26,407 Model training base path: "hmbench-ajmc/en-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-18 16:06:26,408 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:26,408 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:26,408 Logging anything other than scalars to TensorBoard is currently not supported.
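The `lr` column in the iteration lines below follows the `LinearScheduler` plugin listed above: linear warmup over the first 10% of steps to the peak rate of 5e-05, then linear decay to zero. A minimal sketch of that schedule, assuming 304 iterations × 10 epochs = 3040 total steps (the exact step accounting inside Flair may differ slightly):

```python
def linear_schedule_lr(step, peak_lr=5e-05, total_steps=3040, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (HuggingFace-style)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 304 = one epoch here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Reproduce the lr values logged during epoch 1, formatted like the log:
print(f"{linear_schedule_lr(30):.6f}")    # iter 30  -> 0.000005
print(f"{linear_schedule_lr(300):.6f}")   # iter 300 -> 0.000049
print(f"{linear_schedule_lr(3040):.6f}")  # final step -> 0.000000
```

This explains why the rate climbs through epoch 1 and then ticks down by roughly 0.000001 every ~55 iterations for the rest of training.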
2023-10-18 16:06:26,904 epoch 1 - iter 30/304 - loss 3.63641336 - time (sec): 0.50 - samples/sec: 5839.52 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:06:27,388 epoch 1 - iter 60/304 - loss 3.54820873 - time (sec): 0.98 - samples/sec: 5932.33 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:06:27,879 epoch 1 - iter 90/304 - loss 3.31806993 - time (sec): 1.47 - samples/sec: 6193.91 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:06:28,336 epoch 1 - iter 120/304 - loss 3.05677448 - time (sec): 1.93 - samples/sec: 6398.95 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:06:28,788 epoch 1 - iter 150/304 - loss 2.80974247 - time (sec): 2.38 - samples/sec: 6433.95 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:06:29,251 epoch 1 - iter 180/304 - loss 2.53281700 - time (sec): 2.84 - samples/sec: 6453.25 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:06:29,714 epoch 1 - iter 210/304 - loss 2.29809551 - time (sec): 3.31 - samples/sec: 6536.86 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:06:30,182 epoch 1 - iter 240/304 - loss 2.11488418 - time (sec): 3.77 - samples/sec: 6527.55 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:06:30,652 epoch 1 - iter 270/304 - loss 1.95911561 - time (sec): 4.24 - samples/sec: 6565.29 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:06:31,112 epoch 1 - iter 300/304 - loss 1.85536093 - time (sec): 4.70 - samples/sec: 6521.89 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:06:31,172 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:31,172 EPOCH 1 done: loss 1.8488 - lr: 0.000049
2023-10-18 16:06:31,503 DEV : loss 0.6380395889282227 - f1-score (micro avg)  0.0
2023-10-18 16:06:31,508 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:31,965 epoch 2 - iter 30/304 - loss 0.60676131 - time (sec): 0.46 - samples/sec: 6756.18 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:06:32,428 epoch 2 - iter 60/304 - loss 0.65323322 - time (sec): 0.92 - samples/sec: 6966.63 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:06:32,879 epoch 2 - iter 90/304 - loss 0.60717496 - time (sec): 1.37 - samples/sec: 6799.14 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:06:33,339 epoch 2 - iter 120/304 - loss 0.59920224 - time (sec): 1.83 - samples/sec: 6561.00 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:06:33,772 epoch 2 - iter 150/304 - loss 0.59005632 - time (sec): 2.26 - samples/sec: 6655.96 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:06:34,183 epoch 2 - iter 180/304 - loss 0.57517008 - time (sec): 2.67 - samples/sec: 6783.63 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:06:34,593 epoch 2 - iter 210/304 - loss 0.56930138 - time (sec): 3.08 - samples/sec: 6871.36 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:06:35,006 epoch 2 - iter 240/304 - loss 0.55756636 - time (sec): 3.50 - samples/sec: 6991.25 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:06:35,425 epoch 2 - iter 270/304 - loss 0.54208762 - time (sec): 3.92 - samples/sec: 7067.81 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:06:35,835 epoch 2 - iter 300/304 - loss 0.53163100 - time (sec): 4.33 - samples/sec: 7107.19 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:06:35,886 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:35,886 EPOCH 2 done: loss 0.5293 - lr: 0.000045
2023-10-18 16:06:36,558 DEV : loss 0.38052845001220703 - f1-score (micro avg)  0.306
2023-10-18 16:06:36,565 saving best model
2023-10-18 16:06:36,598 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:37,100 epoch 3 - iter 30/304 - loss 0.35465457 - time (sec): 0.50 - samples/sec: 6233.67 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:06:37,584 epoch 3 - iter 60/304 - loss 0.39414242 - time (sec): 0.99 - samples/sec: 6394.57 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:06:38,044 epoch 3 - iter 90/304 - loss 0.40778876 - time (sec): 1.45 - samples/sec: 6434.63 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:06:38,505 epoch 3 - iter 120/304 - loss 0.39545984 - time (sec): 1.91 - samples/sec: 6535.82 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:06:38,976 epoch 3 - iter 150/304 - loss 0.38696959 - time (sec): 2.38 - samples/sec: 6635.61 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:06:39,440 epoch 3 - iter 180/304 - loss 0.39848795 - time (sec): 2.84 - samples/sec: 6664.89 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:06:39,919 epoch 3 - iter 210/304 - loss 0.38685546 - time (sec): 3.32 - samples/sec: 6669.79 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:06:40,370 epoch 3 - iter 240/304 - loss 0.39455687 - time (sec): 3.77 - samples/sec: 6673.86 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:06:40,813 epoch 3 - iter 270/304 - loss 0.39164970 - time (sec): 4.21 - samples/sec: 6628.89 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:06:41,271 epoch 3 - iter 300/304 - loss 0.38611050 - time (sec): 4.67 - samples/sec: 6567.69 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:06:41,326 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:41,326 EPOCH 3 done: loss 0.3864 - lr: 0.000039
2023-10-18 16:06:41,835 DEV : loss 0.33103644847869873 - f1-score (micro avg)  0.4195
2023-10-18 16:06:41,840 saving best model
2023-10-18 16:06:41,875 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:42,328 epoch 4 - iter 30/304 - loss 0.30118792 - time (sec): 0.45 - samples/sec: 6618.99 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:06:42,782 epoch 4 - iter 60/304 - loss 0.35784127 - time (sec): 0.91 - samples/sec: 6748.65 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:06:43,209 epoch 4 - iter 90/304 - loss 0.33839697 - time (sec): 1.33 - samples/sec: 6888.22 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:06:43,641 epoch 4 - iter 120/304 - loss 0.35027725 - time (sec): 1.77 - samples/sec: 6962.41 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:06:44,060 epoch 4 - iter 150/304 - loss 0.34757242 - time (sec): 2.18 - samples/sec: 7103.58 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:06:44,504 epoch 4 - iter 180/304 - loss 0.34869245 - time (sec): 2.63 - samples/sec: 7100.60 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:06:44,960 epoch 4 - iter 210/304 - loss 0.34345179 - time (sec): 3.08 - samples/sec: 6982.54 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:06:45,408 epoch 4 - iter 240/304 - loss 0.33944270 - time (sec): 3.53 - samples/sec: 6934.31 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:06:45,849 epoch 4 - iter 270/304 - loss 0.34162385 - time (sec): 3.97 - samples/sec: 6883.91 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:06:46,309 epoch 4 - iter 300/304 - loss 0.32762499 - time (sec): 4.43 - samples/sec: 6908.03 - lr: 0.000033 - momentum: 0.000000
2023-10-18 16:06:46,366 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:46,366 EPOCH 4 done: loss 0.3276 - lr: 0.000033
2023-10-18 16:06:46,897 DEV : loss 0.29201236367225647 - f1-score (micro avg)  0.4564
2023-10-18 16:06:46,902 saving best model
2023-10-18 16:06:46,936 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:47,394 epoch 5 - iter 30/304 - loss 0.28009451 - time (sec): 0.46 - samples/sec: 6656.21 - lr: 0.000033 - momentum: 0.000000
2023-10-18 16:06:47,846 epoch 5 - iter 60/304 - loss 0.27209313 - time (sec): 0.91 - samples/sec: 6648.88 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:06:48,303 epoch 5 - iter 90/304 - loss 0.25394650 - time (sec): 1.37 - samples/sec: 6541.63 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:06:48,768 epoch 5 - iter 120/304 - loss 0.25718523 - time (sec): 1.83 - samples/sec: 6438.76 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:06:49,233 epoch 5 - iter 150/304 - loss 0.27143920 - time (sec): 2.30 - samples/sec: 6544.66 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:06:49,682 epoch 5 - iter 180/304 - loss 0.27977989 - time (sec): 2.75 - samples/sec: 6621.98 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:06:50,142 epoch 5 - iter 210/304 - loss 0.28379924 - time (sec): 3.21 - samples/sec: 6673.00 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:06:50,591 epoch 5 - iter 240/304 - loss 0.28825810 - time (sec): 3.66 - samples/sec: 6745.64 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:06:51,035 epoch 5 - iter 270/304 - loss 0.29107402 - time (sec): 4.10 - samples/sec: 6730.89 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:06:51,489 epoch 5 - iter 300/304 - loss 0.28983186 - time (sec): 4.55 - samples/sec: 6730.81 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:06:51,548 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:51,548 EPOCH 5 done: loss 0.2887 - lr: 0.000028
2023-10-18 16:06:52,064 DEV : loss 0.2696332335472107 - f1-score (micro avg)  0.5138
2023-10-18 16:06:52,069 saving best model
2023-10-18 16:06:52,101 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:52,557 epoch 6 - iter 30/304 - loss 0.20984348 - time (sec): 0.46 - samples/sec: 6052.75 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:06:53,027 epoch 6 - iter 60/304 - loss 0.27029612 - time (sec): 0.92 - samples/sec: 6432.18 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:06:53,501 epoch 6 - iter 90/304 - loss 0.27915798 - time (sec): 1.40 - samples/sec: 6629.49 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:06:53,955 epoch 6 - iter 120/304 - loss 0.27192028 - time (sec): 1.85 - samples/sec: 6569.99 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:06:54,402 epoch 6 - iter 150/304 - loss 0.25964604 - time (sec): 2.30 - samples/sec: 6568.76 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:06:54,855 epoch 6 - iter 180/304 - loss 0.25797220 - time (sec): 2.75 - samples/sec: 6629.88 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:06:55,311 epoch 6 - iter 210/304 - loss 0.25227825 - time (sec): 3.21 - samples/sec: 6540.24 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:06:55,770 epoch 6 - iter 240/304 - loss 0.25623935 - time (sec): 3.67 - samples/sec: 6620.09 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:06:56,225 epoch 6 - iter 270/304 - loss 0.26774423 - time (sec): 4.12 - samples/sec: 6667.91 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:06:56,681 epoch 6 - iter 300/304 - loss 0.26844020 - time (sec): 4.58 - samples/sec: 6703.84 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:06:56,738 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:56,738 EPOCH 6 done: loss 0.2694 - lr: 0.000022
2023-10-18 16:06:57,252 DEV : loss 0.2644450068473816 - f1-score (micro avg)  0.5312
2023-10-18 16:06:57,257 saving best model
2023-10-18 16:06:57,290 ----------------------------------------------------------------------------------------------------
2023-10-18 16:06:57,744 epoch 7 - iter 30/304 - loss 0.25616094 - time (sec): 0.45 - samples/sec: 6643.49 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:06:58,207 epoch 7 - iter 60/304 - loss 0.26253744 - time (sec): 0.92 - samples/sec: 6840.85 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:06:58,674 epoch 7 - iter 90/304 - loss 0.25055443 - time (sec): 1.38 - samples/sec: 6912.58 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:06:59,129 epoch 7 - iter 120/304 - loss 0.24572083 - time (sec): 1.84 - samples/sec: 6842.97 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:06:59,592 epoch 7 - iter 150/304 - loss 0.25680164 - time (sec): 2.30 - samples/sec: 6712.14 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:07:00,046 epoch 7 - iter 180/304 - loss 0.25355562 - time (sec): 2.76 - samples/sec: 6633.31 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:07:00,488 epoch 7 - iter 210/304 - loss 0.25134230 - time (sec): 3.20 - samples/sec: 6612.69 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:07:00,936 epoch 7 - iter 240/304 - loss 0.24499226 - time (sec): 3.65 - samples/sec: 6615.74 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:07:01,386 epoch 7 - iter 270/304 - loss 0.25248865 - time (sec): 4.10 - samples/sec: 6671.82 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:07:01,847 epoch 7 - iter 300/304 - loss 0.25127369 - time (sec): 4.56 - samples/sec: 6717.57 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:07:01,905 ----------------------------------------------------------------------------------------------------
2023-10-18 16:07:01,905 EPOCH 7 done: loss 0.2525 - lr: 0.000017
2023-10-18 16:07:02,422 DEV : loss 0.2524552643299103 - f1-score (micro avg)  0.5297
2023-10-18 16:07:02,427 ----------------------------------------------------------------------------------------------------
2023-10-18 16:07:02,890 epoch 8 - iter 30/304 - loss 0.22062370 - time (sec): 0.46 - samples/sec: 6270.89 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:07:03,344 epoch 8 - iter 60/304 - loss 0.23611162 - time (sec): 0.92 - samples/sec: 6431.72 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:07:03,796 epoch 8 - iter 90/304 - loss 0.23691160 - time (sec): 1.37 - samples/sec: 6458.59 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:07:04,262 epoch 8 - iter 120/304 - loss 0.24118079 - time (sec): 1.83 - samples/sec: 6513.73 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:07:04,726 epoch 8 - iter 150/304 - loss 0.23513735 - time (sec): 2.30 - samples/sec: 6590.01 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:07:05,196 epoch 8 - iter 180/304 - loss 0.22913159 - time (sec): 2.77 - samples/sec: 6630.72 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:07:05,648 epoch 8 - iter 210/304 - loss 0.22997990 - time (sec): 3.22 - samples/sec: 6672.39 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:07:06,102 epoch 8 - iter 240/304 - loss 0.23022244 - time (sec): 3.67 - samples/sec: 6676.85 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:07:06,555 epoch 8 - iter 270/304 - loss 0.23263650 - time (sec): 4.13 - samples/sec: 6694.71 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:07:07,016 epoch 8 - iter 300/304 - loss 0.23657377 - time (sec): 4.59 - samples/sec: 6656.96 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:07:07,078 ----------------------------------------------------------------------------------------------------
2023-10-18 16:07:07,078 EPOCH 8 done: loss 0.2365 - lr: 0.000011
2023-10-18 16:07:07,590 DEV : loss 0.24281737208366394 - f1-score (micro avg)  0.5434
2023-10-18 16:07:07,596 saving best model
2023-10-18 16:07:07,628 ----------------------------------------------------------------------------------------------------
2023-10-18 16:07:08,085 epoch 9 - iter 30/304 - loss 0.23377814 - time (sec): 0.46 - samples/sec: 6978.69 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:07:08,540 epoch 9 - iter 60/304 - loss 0.20928380 - time (sec): 0.91 - samples/sec: 6742.65 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:07:09,013 epoch 9 - iter 90/304 - loss 0.24501137 - time (sec): 1.38 - samples/sec: 6802.44 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:07:09,477 epoch 9 - iter 120/304 - loss 0.23197914 - time (sec): 1.85 - samples/sec: 6661.82 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:07:09,943 epoch 9 - iter 150/304 - loss 0.23486221 - time (sec): 2.31 - samples/sec: 6628.62 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:07:10,391 epoch 9 - iter 180/304 - loss 0.23227933 - time (sec): 2.76 - samples/sec: 6641.13 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:07:10,851 epoch 9 - iter 210/304 - loss 0.23468390 - time (sec): 3.22 - samples/sec: 6650.71 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:07:11,315 epoch 9 - iter 240/304 - loss 0.23288710 - time (sec): 3.69 - samples/sec: 6660.45 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:07:11,766 epoch 9 - iter 270/304 - loss 0.23030264 - time (sec): 4.14 - samples/sec: 6642.13 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:07:12,231 epoch 9 - iter 300/304 - loss 0.22833423 - time (sec): 4.60 - samples/sec: 6651.08 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:07:12,287 ----------------------------------------------------------------------------------------------------
2023-10-18 16:07:12,287 EPOCH 9 done: loss 0.2287 - lr: 0.000006
2023-10-18 16:07:12,812 DEV : loss 0.24299992620944977 - f1-score (micro avg)  0.5339
2023-10-18 16:07:12,817 ----------------------------------------------------------------------------------------------------
2023-10-18 16:07:13,283 epoch 10 - iter 30/304 - loss 0.21612585 - time (sec): 0.47 - samples/sec: 6795.09 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:07:13,737 epoch 10 - iter 60/304 - loss 0.22735348 - time (sec): 0.92 - samples/sec: 7059.44 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:07:14,182 epoch 10 - iter 90/304 - loss 0.21222221 - time (sec): 1.36 - samples/sec: 6805.68 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:07:14,636 epoch 10 - iter 120/304 - loss 0.20604343 - time (sec): 1.82 - samples/sec: 6711.82 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:07:15,095 epoch 10 - iter 150/304 - loss 0.22006462 - time (sec): 2.28 - samples/sec: 6717.05 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:07:15,561 epoch 10 - iter 180/304 - loss 0.21726447 - time (sec): 2.74 - samples/sec: 6707.95 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:07:16,019 epoch 10 - iter 210/304 - loss 0.21398677 - time (sec): 3.20 - samples/sec: 6721.19 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:07:16,467 epoch 10 - iter 240/304 - loss 0.22196454 - time (sec): 3.65 - samples/sec: 6756.53 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:07:16,922 epoch 10 - iter 270/304 - loss 0.23309974 - time (sec): 4.10 - samples/sec: 6770.38 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:07:17,389 epoch 10 - iter 300/304 - loss 0.22925414 - time (sec): 4.57 - samples/sec: 6705.85 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:07:17,447 ----------------------------------------------------------------------------------------------------
2023-10-18 16:07:17,448 EPOCH 10 done: loss 0.2278 - lr: 0.000000
2023-10-18 16:07:17,960 DEV : loss 0.24338579177856445 - f1-score (micro avg)  0.5311
2023-10-18 16:07:17,994 ----------------------------------------------------------------------------------------------------
2023-10-18 16:07:17,995 Loading model from best epoch ...
2023-10-18 16:07:18,075 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
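The 25-entry tag dictionary above is the BIOES scheme over the six AJMC entity types (scope, pers, work, loc, date, object): four positional prefixes per type, plus the outside tag O. A quick sketch that reconstructs the same tag set:

```python
# BIOES tag set for the six entity types listed in the log
entity_types = ["scope", "pers", "work", "loc", "date", "object"]
prefixes = ["S-", "B-", "E-", "I-"]  # Single, Begin, End, Inside

tags = ["O"] + [p + t for t in entity_types for p in prefixes]
print(len(tags))  # 6 types x 4 prefixes + O = 25, matching the linear layer's out_features
```

This also explains the final classifier shape: `Linear(in_features=128, out_features=25)`, one logit per tag.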
2023-10-18 16:07:18,552

Results:
- F-score (micro) 0.561
- F-score (macro) 0.3467
- Accuracy 0.4099

By class:
              precision    recall  f1-score   support

       scope     0.5000    0.6093    0.5493       151
        work     0.4103    0.6737    0.5100        95
        pers     0.7317    0.6250    0.6742        96
         loc     0.0000    0.0000    0.0000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.5118    0.6207    0.5610       348
   macro avg     0.3284    0.3816    0.3467       348
weighted avg     0.5308    0.6207    0.5635       348
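The summary rows follow from the per-class numbers: micro F1 is the harmonic mean of the micro precision and recall, macro F1 is the unweighted mean of the per-class F1s, and the weighted row averages by support. A quick consistency check on the table above (using the rounded figures from the log, so agreement is only to ~3-4 decimals):

```python
class_f1 = {"scope": 0.5493, "work": 0.5100, "pers": 0.6742, "loc": 0.0, "date": 0.0}
support  = {"scope": 151, "work": 95, "pers": 96, "loc": 3, "date": 3}

# Micro F1: harmonic mean of the micro-averaged precision and recall
p, r = 0.5118, 0.6207
micro_f1 = 2 * p * r / (p + r)

# Macro F1: unweighted mean over the five classes (loc/date drag it down)
macro_f1 = sum(class_f1.values()) / len(class_f1)

# Weighted F1: support-weighted mean of per-class F1
weighted_f1 = sum(class_f1[c] * support[c] for c in class_f1) / sum(support.values())

print(round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4))
```

The gap between micro (0.561) and macro (0.3467) reflects the two classes with only 3 test mentions each scoring zero.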
2023-10-18 16:07:18,552 ----------------------------------------------------------------------------------------------------