robert_bilstm_mega_res-ner-msra-ner-ner-msra-ner

This model is a fine-tuned version of hfl/chinese-roberta-wwm-ext-large on an unspecified dataset; the model and repository names indicate MSRA NER. It achieves the following results on the evaluation set (a hedged usage sketch follows the metrics below):

  • Loss: 0.0621
  • Precision: 0.9538
  • Recall: 0.9573
  • F1: 0.9555
  • Accuracy: 0.9940
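
The snippet below is a minimal usage sketch, not the author's documented API: it assumes the checkpoint loads into the stock Transformers token-classification head, while the model name suggests a custom BiLSTM/MEGA head on top of the encoder, in which case the original training code would be needed instead. The repository id PassbyGrocer/hreb-msra is taken from the model page, and the example sentence is purely illustrative.

```python
from transformers import pipeline

# Assumption: the weights are compatible with AutoModelForTokenClassification.
# If the checkpoint uses a custom BiLSTM/MEGA head, load it with the training code instead.
ner = pipeline(
    "token-classification",
    model="PassbyGrocer/hreb-msra",
    aggregation_strategy="simple",  # merge word pieces into whole entity spans
)

print(ner("我爱北京天安门"))  # prints predicted entity spans with labels and scores
```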

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a hedged TrainingArguments sketch reproducing them appears after the list:

  • learning_rate: 5e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
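
A minimal sketch of the corresponding TrainingArguments, assuming a single GPU (so the listed batch sizes map to per_device_*), per-epoch evaluation as implied by the results table below, and a hypothetical output directory; argument names follow Transformers 4.47:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="robert_bilstm_mega_res-ner-msra-ner-ner-msra-ner",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=64,  # assumes a single device
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",             # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    eval_strategy="epoch",           # assumption: matches the per-epoch rows below
    logging_strategy="epoch",
    save_strategy="epoch",
)
```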

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|---|---|
| 0.0239 | 1.0 | 725 | 0.0232 | 0.9242 | 0.9344 | 0.9293 | 0.9931 |
| 0.0139 | 2.0 | 1450 | 0.0254 | 0.9373 | 0.9459 | 0.9416 | 0.9925 |
| 0.006 | 3.0 | 2175 | 0.0294 | 0.9415 | 0.9480 | 0.9448 | 0.9930 |
| 0.0052 | 4.0 | 2900 | 0.0303 | 0.9389 | 0.9486 | 0.9437 | 0.9937 |
| 0.0049 | 5.0 | 3625 | 0.0303 | 0.9422 | 0.9498 | 0.9459 | 0.9933 |
| 0.0034 | 6.0 | 4350 | 0.0353 | 0.9411 | 0.9594 | 0.9502 | 0.9934 |
| 0.0015 | 7.0 | 5075 | 0.0372 | 0.9404 | 0.9498 | 0.9450 | 0.9927 |
| 0.0013 | 8.0 | 5800 | 0.0379 | 0.9477 | 0.9492 | 0.9485 | 0.9938 |
| 0.0006 | 9.0 | 6525 | 0.0405 | 0.9516 | 0.9502 | 0.9509 | 0.9937 |
| 0.0039 | 10.0 | 7250 | 0.0442 | 0.9420 | 0.9536 | 0.9478 | 0.9931 |
| 0.0013 | 11.0 | 7975 | 0.0393 | 0.9479 | 0.9528 | 0.9504 | 0.9936 |
| 0.001 | 12.0 | 8700 | 0.0431 | 0.9455 | 0.9513 | 0.9484 | 0.9933 |
| 0.0011 | 13.0 | 9425 | 0.0431 | 0.9487 | 0.9425 | 0.9455 | 0.9936 |
| 0.0003 | 14.0 | 10150 | 0.0425 | 0.9392 | 0.9450 | 0.9421 | 0.9933 |
| 0.0001 | 15.0 | 10875 | 0.0456 | 0.9475 | 0.9515 | 0.9495 | 0.9937 |
| 0.0011 | 16.0 | 11600 | 0.0446 | 0.9467 | 0.9471 | 0.9469 | 0.9928 |
| 0.0002 | 17.0 | 12325 | 0.0500 | 0.9532 | 0.9457 | 0.9495 | 0.9933 |
| 0.0001 | 18.0 | 13050 | 0.0504 | 0.9479 | 0.9490 | 0.9485 | 0.9929 |
| 0.0002 | 19.0 | 13775 | 0.0455 | 0.9463 | 0.9527 | 0.9495 | 0.9933 |
| 0.0013 | 20.0 | 14500 | 0.0471 | 0.9487 | 0.9544 | 0.9515 | 0.9933 |
| 0.0005 | 21.0 | 15225 | 0.0425 | 0.9491 | 0.9584 | 0.9537 | 0.9936 |
| 0.0009 | 22.0 | 15950 | 0.0503 | 0.9455 | 0.9555 | 0.9505 | 0.9931 |
| 0.0003 | 23.0 | 16675 | 0.0474 | 0.9530 | 0.9555 | 0.9543 | 0.9938 |
| 0.0006 | 24.0 | 17400 | 0.0481 | 0.9531 | 0.9538 | 0.9534 | 0.9937 |
| 0.0013 | 25.0 | 18125 | 0.0502 | 0.9467 | 0.9534 | 0.9500 | 0.9934 |
| 0.0001 | 26.0 | 18850 | 0.0517 | 0.9461 | 0.9492 | 0.9476 | 0.9933 |
| 0.0001 | 27.0 | 19575 | 0.0410 | 0.9536 | 0.9530 | 0.9533 | 0.9937 |
| 0.0011 | 28.0 | 20300 | 0.0453 | 0.9520 | 0.9498 | 0.9509 | 0.9937 |
| 0.0007 | 29.0 | 21025 | 0.0444 | 0.9479 | 0.9480 | 0.9479 | 0.9935 |
| 0.0 | 30.0 | 21750 | 0.0498 | 0.9529 | 0.9498 | 0.9513 | 0.9937 |
| 0.0001 | 31.0 | 22475 | 0.0490 | 0.9514 | 0.9496 | 0.9505 | 0.9935 |
| 0.001 | 32.0 | 23200 | 0.0499 | 0.9495 | 0.9486 | 0.9491 | 0.9934 |
| 0.0001 | 33.0 | 23925 | 0.0451 | 0.9499 | 0.9557 | 0.9528 | 0.9939 |
| 0.0002 | 34.0 | 24650 | 0.0469 | 0.9486 | 0.9563 | 0.9525 | 0.9937 |
| 0.0001 | 35.0 | 25375 | 0.0505 | 0.9568 | 0.9496 | 0.9532 | 0.9938 |
| 0.0003 | 36.0 | 26100 | 0.0491 | 0.9593 | 0.9525 | 0.9559 | 0.9942 |
| 0.0005 | 37.0 | 26825 | 0.0432 | 0.9551 | 0.9532 | 0.9542 | 0.9939 |
| 0.0003 | 38.0 | 27550 | 0.0465 | 0.9536 | 0.9486 | 0.9511 | 0.9937 |
| 0.0019 | 39.0 | 28275 | 0.0491 | 0.9574 | 0.9469 | 0.9521 | 0.9937 |
| 0.0 | 40.0 | 29000 | 0.0470 | 0.9582 | 0.9534 | 0.9558 | 0.9940 |
| 0.0008 | 41.0 | 29725 | 0.0477 | 0.9505 | 0.9538 | 0.9522 | 0.9937 |
| 0.0 | 42.0 | 30450 | 0.0544 | 0.9500 | 0.9542 | 0.9521 | 0.9937 |
| 0.0002 | 43.0 | 31175 | 0.0527 | 0.9571 | 0.9492 | 0.9531 | 0.9938 |
| 0.0005 | 44.0 | 31900 | 0.0510 | 0.9574 | 0.9513 | 0.9543 | 0.9939 |
| 0.0006 | 45.0 | 32625 | 0.0478 | 0.9527 | 0.9536 | 0.9532 | 0.9938 |
| 0.0001 | 46.0 | 33350 | 0.0464 | 0.9559 | 0.9517 | 0.9538 | 0.9937 |
| 0.0001 | 47.0 | 34075 | 0.0478 | 0.9578 | 0.9530 | 0.9554 | 0.9939 |
| 0.0 | 48.0 | 34800 | 0.0507 | 0.9574 | 0.9515 | 0.9544 | 0.9940 |
| 0.0 | 49.0 | 35525 | 0.0534 | 0.9531 | 0.9534 | 0.9532 | 0.9939 |
| 0.0004 | 50.0 | 36250 | 0.0512 | 0.9541 | 0.9530 | 0.9536 | 0.9941 |
| 0.0001 | 51.0 | 36975 | 0.0478 | 0.9549 | 0.9532 | 0.9541 | 0.9940 |
| 0.0001 | 52.0 | 37700 | 0.0446 | 0.9541 | 0.9555 | 0.9548 | 0.9942 |
| 0.0 | 53.0 | 38425 | 0.0522 | 0.9529 | 0.9509 | 0.9519 | 0.9935 |
| 0.0001 | 54.0 | 39150 | 0.0507 | 0.9552 | 0.9525 | 0.9538 | 0.9937 |
| 0.0003 | 55.0 | 39875 | 0.0493 | 0.9466 | 0.9484 | 0.9475 | 0.9930 |
| 0.0 | 56.0 | 40600 | 0.0496 | 0.9507 | 0.9496 | 0.9501 | 0.9934 |
| 0.0 | 57.0 | 41325 | 0.0502 | 0.9512 | 0.9559 | 0.9535 | 0.9940 |
| 0.0 | 58.0 | 42050 | 0.0528 | 0.9465 | 0.9525 | 0.9494 | 0.9932 |
| 0.0 | 59.0 | 42775 | 0.0578 | 0.9480 | 0.9503 | 0.9492 | 0.9931 |
| 0.0 | 60.0 | 43500 | 0.0557 | 0.9506 | 0.9486 | 0.9496 | 0.9935 |
| 0.0 | 61.0 | 44225 | 0.0487 | 0.9539 | 0.9521 | 0.9530 | 0.9936 |
| 0.0 | 62.0 | 44950 | 0.0519 | 0.9534 | 0.9536 | 0.9535 | 0.9938 |
| 0.0 | 63.0 | 45675 | 0.0532 | 0.9531 | 0.9554 | 0.9542 | 0.9939 |
| 0.0 | 64.0 | 46400 | 0.0572 | 0.9534 | 0.9527 | 0.9530 | 0.9938 |
| 0.0001 | 65.0 | 47125 | 0.0563 | 0.9550 | 0.9527 | 0.9538 | 0.9940 |
| 0.0 | 66.0 | 47850 | 0.0550 | 0.9568 | 0.9507 | 0.9538 | 0.9940 |
| 0.0 | 67.0 | 48575 | 0.0585 | 0.9480 | 0.9542 | 0.9511 | 0.9935 |
| 0.0003 | 68.0 | 49300 | 0.0607 | 0.9501 | 0.9496 | 0.9499 | 0.9936 |
| 0.0 | 69.0 | 50025 | 0.0577 | 0.9529 | 0.9548 | 0.9539 | 0.9939 |
| 0.0 | 70.0 | 50750 | 0.0583 | 0.9541 | 0.9569 | 0.9555 | 0.9941 |
| 0.0001 | 71.0 | 51475 | 0.0549 | 0.9530 | 0.9486 | 0.9508 | 0.9938 |
| 0.0 | 72.0 | 52200 | 0.0592 | 0.9546 | 0.9509 | 0.9528 | 0.9937 |
| 0.0 | 73.0 | 52925 | 0.0598 | 0.9524 | 0.9502 | 0.9513 | 0.9936 |
| 0.0 | 74.0 | 53650 | 0.0583 | 0.9530 | 0.9517 | 0.9523 | 0.9937 |
| 0.0 | 75.0 | 54375 | 0.0602 | 0.9513 | 0.9513 | 0.9513 | 0.9936 |
| 0.0 | 76.0 | 55100 | 0.0624 | 0.9510 | 0.9527 | 0.9518 | 0.9934 |
| 0.0 | 77.0 | 55825 | 0.0622 | 0.9523 | 0.9527 | 0.9525 | 0.9935 |
| 0.0 | 78.0 | 56550 | 0.0599 | 0.9509 | 0.9536 | 0.9522 | 0.9938 |
| 0.0 | 79.0 | 57275 | 0.0599 | 0.9509 | 0.9550 | 0.9529 | 0.9937 |
| 0.0 | 80.0 | 58000 | 0.0588 | 0.9551 | 0.9536 | 0.9544 | 0.9939 |
| 0.0 | 81.0 | 58725 | 0.0581 | 0.9547 | 0.9561 | 0.9554 | 0.9941 |
| 0.0 | 82.0 | 59450 | 0.0587 | 0.9574 | 0.9567 | 0.9571 | 0.9940 |
| 0.0 | 83.0 | 60175 | 0.0592 | 0.9533 | 0.9582 | 0.9558 | 0.9940 |
| 0.0 | 84.0 | 60900 | 0.0602 | 0.9534 | 0.9569 | 0.9551 | 0.9939 |
| 0.0 | 85.0 | 61625 | 0.0601 | 0.9530 | 0.9554 | 0.9542 | 0.9938 |
| 0.0 | 86.0 | 62350 | 0.0608 | 0.9528 | 0.9561 | 0.9545 | 0.9939 |
| 0.0 | 87.0 | 63075 | 0.0606 | 0.9560 | 0.9538 | 0.9549 | 0.9939 |
| 0.0 | 88.0 | 63800 | 0.0590 | 0.9514 | 0.9575 | 0.9544 | 0.9940 |
| 0.0 | 89.0 | 64525 | 0.0611 | 0.9542 | 0.9577 | 0.9559 | 0.9940 |
| 0.0002 | 90.0 | 65250 | 0.0617 | 0.9563 | 0.9567 | 0.9565 | 0.9940 |
| 0.0 | 91.0 | 65975 | 0.0611 | 0.9578 | 0.9555 | 0.9566 | 0.9940 |
| 0.0004 | 92.0 | 66700 | 0.0628 | 0.9510 | 0.9567 | 0.9539 | 0.9939 |
| 0.0 | 93.0 | 67425 | 0.0634 | 0.9523 | 0.9561 | 0.9542 | 0.9939 |
| 0.0 | 94.0 | 68150 | 0.0629 | 0.9534 | 0.9571 | 0.9552 | 0.9940 |
| 0.0 | 95.0 | 68875 | 0.0627 | 0.9523 | 0.9565 | 0.9544 | 0.9940 |
| 0.0 | 96.0 | 69600 | 0.0627 | 0.9528 | 0.9565 | 0.9547 | 0.9940 |
| 0.0 | 97.0 | 70325 | 0.0625 | 0.9536 | 0.9565 | 0.9550 | 0.9940 |
| 0.0 | 98.0 | 71050 | 0.0620 | 0.9558 | 0.9561 | 0.9559 | 0.9941 |
| 0.0 | 99.0 | 71775 | 0.0620 | 0.9543 | 0.9573 | 0.9558 | 0.9940 |
| 0.0 | 100.0 | 72500 | 0.0621 | 0.9538 | 0.9573 | 0.9555 | 0.9940 |
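
The per-epoch Precision, Recall, F1 and Accuracy above are the entity-level scores the Trainer reports when a seqeval-based compute_metrics function is supplied, which is the usual setup for token-classification fine-tuning; the card does not confirm this, so the sketch below is an assumption, and the label list is hypothetical since the tag set is not documented.

```python
import numpy as np
import evaluate

seqeval = evaluate.load("seqeval")

# Hypothetical BIO tag set; the actual labels of this checkpoint are not documented in the card.
label_list = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)

    # Drop ignored positions (label id -100, i.e. special tokens / padding) before scoring.
    true_predictions = [
        [label_list[p] for p, l in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]
    true_labels = [
        [label_list[l] for p, l in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]

    results = seqeval.compute(predictions=true_predictions, references=true_labels)
    return {
        "precision": results["overall_precision"],
        "recall": results["overall_recall"],
        "f1": results["overall_f1"],
        "accuracy": results["overall_accuracy"],
    }
```

With per-epoch evaluation, each call of this function on the validation set produces one row of the table above.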

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.3.0+cu118
  • Datasets 3.2.0
  • Tokenizers 0.21.0