bert_bilstm_dst_crf-ner-weibo

This model is a fine-tuned version of google-bert/bert-base-chinese on an unspecified dataset (presumably a Weibo NER corpus, given the model name). It achieves the following results on the evaluation set; a note on how these metrics are computed follows the list:

  • Loss: 0.2064
  • Precision: 0.6286
  • Recall: 0.7224
  • F1: 0.6722
  • Accuracy: 0.9691
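
Precision, recall, and F1 here are the usual entity-level NER metrics, while accuracy is token-level. A minimal sketch of how such numbers are typically computed, assuming the commonly used seqeval package (the card does not state the actual evaluation code); the BIO tags below are illustrative:

```python
from seqeval.metrics import precision_score, recall_score, f1_score, accuracy_score

# Illustrative gold and predicted BIO tag sequences for one sentence.
y_true = [["B-PER", "I-PER", "O", "B-LOC"]]
y_pred = [["B-PER", "I-PER", "O", "O"]]

print(precision_score(y_true, y_pred))  # entity-level precision
print(recall_score(y_true, y_pred))     # entity-level recall
print(f1_score(y_true, y_pred))         # entity-level F1
print(accuracy_score(y_true, y_pred))   # token-level accuracy
```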

Model description

More information needed
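
Going only by the name, this appears to be google-bert/bert-base-chinese with a BiLSTM layer and a CRF decoder on top, trained for Chinese NER on Weibo text (what the "dst" part of the name denotes is not documented). A minimal sketch of such a head, assuming the pytorch-crf package; the class name and hidden size are hypothetical:

```python
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF  # pytorch-crf package (assumed)

class BertBiLstmCrf(nn.Module):  # hypothetical class name
    def __init__(self, num_labels, hidden=256):
        super().__init__()
        self.bert = BertModel.from_pretrained("google-bert/bert-base-chinese")
        self.lstm = nn.LSTM(self.bert.config.hidden_size, hidden,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, num_labels)
        self.crf = CRF(num_labels, batch_first=True)

    def forward(self, input_ids, attention_mask, labels=None):
        # Contextual token embeddings -> BiLSTM -> per-token emission scores
        x = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        x, _ = self.lstm(x)
        emissions = self.classifier(x)
        mask = attention_mask.bool()
        if labels is not None:
            return -self.crf(emissions, labels, mask=mask)  # negative log-likelihood
        return self.crf.decode(emissions, mask=mask)        # best tag sequences
```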

Intended uses & limitations

More information needed
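
Absent author documentation, the natural use is Chinese NER on social-media text. A hedged usage sketch, assuming the checkpoint loads through the standard token-classification interface (with a custom BiLSTM-CRF head it may instead need the custom model class); the example sentence and its entities are illustrative:

```python
from transformers import pipeline

# Assumption: the checkpoint is loadable as a standard token-classification
# model. With a custom BiLSTM-CRF head this may fail or need custom code.
ner = pipeline(
    "token-classification",
    model="PassbyGrocer/bert_bilstm_dst_crf-ner-weibo",
    aggregation_strategy="simple",  # merge B-/I- word pieces into entity spans
)
print(ner("我今天在北京见到了马云。"))  # "I met Jack Ma in Beijing today."
```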

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
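
As a sketch, these settings map onto Transformers TrainingArguments roughly as follows (argument names match the 4.46-era API; output_dir is a placeholder, and the model/dataset/Trainer wiring is omitted):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="bert_bilstm_dst_crf-ner-weibo",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",          # AdamW, torch implementation
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,                    # "Native AMP" mixed precision
)
```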

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:--:|:--------:|
| 0.4101 | 1.0 | 22 | 0.3430 | 0.0 | 0.0 | 0.0 | 0.9330 |
| 0.2448 | 2.0 | 44 | 0.1469 | 0.5153 | 0.4756 | 0.4947 | 0.9626 |
| 0.138 | 3.0 | 66 | 0.1119 | 0.5918 | 0.7044 | 0.6432 | 0.9715 |
| 0.0899 | 4.0 | 88 | 0.1064 | 0.5565 | 0.6967 | 0.6187 | 0.9699 |
| 0.0616 | 5.0 | 110 | 0.1064 | 0.5978 | 0.6915 | 0.6412 | 0.9716 |
| 0.0553 | 6.0 | 132 | 0.1112 | 0.6078 | 0.6812 | 0.6424 | 0.9702 |
| 0.0396 | 7.0 | 154 | 0.1165 | 0.6366 | 0.7249 | 0.6779 | 0.9705 |
| 0.0343 | 8.0 | 176 | 0.1204 | 0.6208 | 0.7069 | 0.6611 | 0.9689 |
| 0.0274 | 9.0 | 198 | 0.1365 | 0.6191 | 0.7481 | 0.6775 | 0.9674 |
| 0.0291 | 10.0 | 220 | 0.1403 | 0.6288 | 0.6838 | 0.6552 | 0.9689 |
| 0.0199 | 11.0 | 242 | 0.1415 | 0.6330 | 0.7095 | 0.6691 | 0.9688 |
| 0.0204 | 12.0 | 264 | 0.1447 | 0.5979 | 0.7224 | 0.6542 | 0.9685 |
| 0.0162 | 13.0 | 286 | 0.1499 | 0.5822 | 0.7378 | 0.6508 | 0.9669 |
| 0.0163 | 14.0 | 308 | 0.1441 | 0.6138 | 0.7069 | 0.6571 | 0.9691 |
| 0.0156 | 15.0 | 330 | 0.1543 | 0.6157 | 0.7044 | 0.6571 | 0.9678 |
| 0.0107 | 16.0 | 352 | 0.1546 | 0.5957 | 0.7121 | 0.6487 | 0.9673 |
| 0.0134 | 17.0 | 374 | 0.1558 | 0.5860 | 0.7095 | 0.6419 | 0.9654 |
| 0.0103 | 18.0 | 396 | 0.1557 | 0.6030 | 0.7147 | 0.6541 | 0.9669 |
| 0.0087 | 19.0 | 418 | 0.1596 | 0.6031 | 0.6915 | 0.6443 | 0.9665 |
| 0.0094 | 20.0 | 440 | 0.1568 | 0.6105 | 0.6889 | 0.6473 | 0.9683 |
| 0.0106 | 21.0 | 462 | 0.1547 | 0.6561 | 0.6915 | 0.6733 | 0.9696 |
| 0.0088 | 22.0 | 484 | 0.1627 | 0.6483 | 0.6967 | 0.6716 | 0.9696 |
| 0.0077 | 23.0 | 506 | 0.1628 | 0.6059 | 0.7429 | 0.6674 | 0.9669 |
| 0.0076 | 24.0 | 528 | 0.1695 | 0.6174 | 0.6761 | 0.6454 | 0.9660 |
| 0.0081 | 25.0 | 550 | 0.1644 | 0.6387 | 0.7044 | 0.6699 | 0.9690 |
| 0.0066 | 26.0 | 572 | 0.1674 | 0.6225 | 0.7121 | 0.6643 | 0.9684 |
| 0.0067 | 27.0 | 594 | 0.1640 | 0.6281 | 0.7121 | 0.6675 | 0.9691 |
| 0.0065 | 28.0 | 616 | 0.1693 | 0.6091 | 0.7249 | 0.6620 | 0.9672 |
| 0.0063 | 29.0 | 638 | 0.1737 | 0.6299 | 0.7044 | 0.6650 | 0.9688 |
| 0.0141 | 30.0 | 660 | 0.1772 | 0.6205 | 0.7147 | 0.6643 | 0.9673 |
| 0.0064 | 31.0 | 682 | 0.1817 | 0.6233 | 0.7275 | 0.6714 | 0.9685 |
| 0.0082 | 32.0 | 704 | 0.1704 | 0.6392 | 0.6967 | 0.6667 | 0.9689 |
| 0.0051 | 33.0 | 726 | 0.1663 | 0.6236 | 0.7069 | 0.6627 | 0.9678 |
| 0.0041 | 34.0 | 748 | 0.1767 | 0.6278 | 0.7198 | 0.6707 | 0.9676 |
| 0.0053 | 35.0 | 770 | 0.1749 | 0.6529 | 0.6915 | 0.6717 | 0.9687 |
| 0.0066 | 36.0 | 792 | 0.1810 | 0.6382 | 0.7121 | 0.6731 | 0.9677 |
| 0.0044 | 37.0 | 814 | 0.1721 | 0.6351 | 0.7069 | 0.6691 | 0.9683 |
| 0.0043 | 38.0 | 836 | 0.1833 | 0.6283 | 0.7301 | 0.6754 | 0.9683 |
| 0.0047 | 39.0 | 858 | 0.1862 | 0.6176 | 0.7224 | 0.6659 | 0.9676 |
| 0.0038 | 40.0 | 880 | 0.1826 | 0.6106 | 0.7095 | 0.6564 | 0.9677 |
| 0.0045 | 41.0 | 902 | 0.1888 | 0.6069 | 0.7224 | 0.6596 | 0.9674 |
| 0.004 | 42.0 | 924 | 0.1862 | 0.6180 | 0.7069 | 0.6595 | 0.9682 |
| 0.0054 | 43.0 | 946 | 0.1903 | 0.6 | 0.7095 | 0.6502 | 0.9674 |
| 0.0052 | 44.0 | 968 | 0.1838 | 0.6379 | 0.7018 | 0.6683 | 0.9680 |
| 0.004 | 45.0 | 990 | 0.1850 | 0.6114 | 0.7198 | 0.6612 | 0.9676 |
| 0.0051 | 46.0 | 1012 | 0.1830 | 0.6412 | 0.7121 | 0.6748 | 0.9683 |
| 0.0045 | 47.0 | 1034 | 0.1939 | 0.6134 | 0.7301 | 0.6667 | 0.9683 |
| 0.0039 | 48.0 | 1056 | 0.1876 | 0.6559 | 0.6812 | 0.6683 | 0.9689 |
| 0.0041 | 49.0 | 1078 | 0.1904 | 0.6188 | 0.7095 | 0.6611 | 0.9675 |
| 0.0039 | 50.0 | 1100 | 0.1848 | 0.6242 | 0.7172 | 0.6675 | 0.9681 |
| 0.0043 | 51.0 | 1122 | 0.1823 | 0.6288 | 0.6967 | 0.6610 | 0.9685 |
| 0.0041 | 52.0 | 1144 | 0.1951 | 0.6137 | 0.7147 | 0.6603 | 0.9677 |
| 0.004 | 53.0 | 1166 | 0.1878 | 0.6026 | 0.7095 | 0.6517 | 0.9678 |
| 0.0047 | 54.0 | 1188 | 0.1843 | 0.6247 | 0.6889 | 0.6553 | 0.9687 |
| 0.0042 | 55.0 | 1210 | 0.1947 | 0.6132 | 0.7172 | 0.6611 | 0.9685 |
| 0.0039 | 56.0 | 1232 | 0.1902 | 0.6330 | 0.7095 | 0.6691 | 0.9690 |
| 0.0038 | 57.0 | 1254 | 0.1915 | 0.6339 | 0.7121 | 0.6707 | 0.9691 |
| 0.0035 | 58.0 | 1276 | 0.1887 | 0.6264 | 0.7198 | 0.6699 | 0.9686 |
| 0.0044 | 59.0 | 1298 | 0.1907 | 0.6247 | 0.7147 | 0.6667 | 0.9686 |
| 0.0026 | 60.0 | 1320 | 0.1927 | 0.6362 | 0.7147 | 0.6731 | 0.9687 |
| 0.004 | 61.0 | 1342 | 0.1904 | 0.6374 | 0.7095 | 0.6715 | 0.9689 |
| 0.0041 | 62.0 | 1364 | 0.1914 | 0.6222 | 0.7198 | 0.6675 | 0.9681 |
| 0.0037 | 63.0 | 1386 | 0.1878 | 0.6298 | 0.7172 | 0.6707 | 0.9684 |
| 0.0042 | 64.0 | 1408 | 0.1934 | 0.6074 | 0.7198 | 0.6588 | 0.9674 |
| 0.0047 | 65.0 | 1430 | 0.1992 | 0.6092 | 0.7172 | 0.6588 | 0.9676 |
| 0.0042 | 66.0 | 1452 | 0.1968 | 0.6186 | 0.7172 | 0.6643 | 0.9679 |
| 0.0038 | 67.0 | 1474 | 0.1970 | 0.6189 | 0.7224 | 0.6667 | 0.9683 |
| 0.0033 | 68.0 | 1496 | 0.1976 | 0.6173 | 0.7172 | 0.6635 | 0.9680 |
| 0.0037 | 69.0 | 1518 | 0.1983 | 0.6247 | 0.7147 | 0.6667 | 0.9684 |
| 0.0037 | 70.0 | 1540 | 0.1955 | 0.6247 | 0.7147 | 0.6667 | 0.9685 |
| 0.0038 | 71.0 | 1562 | 0.1970 | 0.6290 | 0.7147 | 0.6691 | 0.9682 |
| 0.0034 | 72.0 | 1584 | 0.2001 | 0.6242 | 0.7172 | 0.6675 | 0.9681 |
| 0.0039 | 73.0 | 1606 | 0.2023 | 0.6293 | 0.7069 | 0.6659 | 0.9676 |
| 0.0027 | 74.0 | 1628 | 0.2003 | 0.6381 | 0.7069 | 0.6707 | 0.9685 |
| 0.0037 | 75.0 | 1650 | 0.2009 | 0.6203 | 0.7224 | 0.6675 | 0.9683 |
| 0.0039 | 76.0 | 1672 | 0.2017 | 0.6275 | 0.7147 | 0.6683 | 0.9687 |
| 0.0035 | 77.0 | 1694 | 0.2016 | 0.6166 | 0.7275 | 0.6675 | 0.9688 |
| 0.0034 | 78.0 | 1716 | 0.2031 | 0.6108 | 0.7301 | 0.6651 | 0.9687 |
| 0.0028 | 79.0 | 1738 | 0.2029 | 0.6116 | 0.7326 | 0.6667 | 0.9682 |
| 0.003 | 80.0 | 1760 | 0.2036 | 0.6233 | 0.7275 | 0.6714 | 0.9683 |
| 0.0038 | 81.0 | 1782 | 0.2063 | 0.6303 | 0.7275 | 0.6754 | 0.9676 |
| 0.0042 | 82.0 | 1804 | 0.2040 | 0.6378 | 0.7198 | 0.6763 | 0.9685 |
| 0.0035 | 83.0 | 1826 | 0.2023 | 0.6149 | 0.7224 | 0.6643 | 0.9681 |
| 0.0033 | 84.0 | 1848 | 0.1991 | 0.6335 | 0.7198 | 0.6739 | 0.9685 |
| 0.0043 | 85.0 | 1870 | 0.2013 | 0.6306 | 0.7198 | 0.6723 | 0.9686 |
| 0.0036 | 86.0 | 1892 | 0.1988 | 0.6364 | 0.7018 | 0.6675 | 0.9694 |
| 0.0037 | 87.0 | 1914 | 0.2041 | 0.6217 | 0.7224 | 0.6683 | 0.9689 |
| 0.0031 | 88.0 | 1936 | 0.2043 | 0.6231 | 0.7224 | 0.6690 | 0.9689 |
| 0.0027 | 89.0 | 1958 | 0.2041 | 0.625 | 0.7198 | 0.6691 | 0.9688 |
| 0.0026 | 90.0 | 1980 | 0.2053 | 0.6284 | 0.7172 | 0.6699 | 0.9691 |
| 0.0031 | 91.0 | 2002 | 0.2049 | 0.6306 | 0.7198 | 0.6723 | 0.9690 |
| 0.003 | 92.0 | 2024 | 0.2056 | 0.6315 | 0.7224 | 0.6739 | 0.9687 |
| 0.0028 | 93.0 | 2046 | 0.2066 | 0.6149 | 0.7224 | 0.6643 | 0.9684 |
| 0.0031 | 94.0 | 2068 | 0.2075 | 0.6135 | 0.7224 | 0.6635 | 0.9684 |
| 0.0038 | 95.0 | 2090 | 0.2070 | 0.6198 | 0.7249 | 0.6682 | 0.9685 |
| 0.003 | 96.0 | 2112 | 0.2063 | 0.6253 | 0.7249 | 0.6714 | 0.9689 |
| 0.0028 | 97.0 | 2134 | 0.2062 | 0.6275 | 0.7275 | 0.6738 | 0.9692 |
| 0.0031 | 98.0 | 2156 | 0.2063 | 0.6272 | 0.7224 | 0.6714 | 0.9692 |
| 0.0026 | 99.0 | 2178 | 0.2062 | 0.6286 | 0.7224 | 0.6722 | 0.9691 |
| 0.002 | 100.0 | 2200 | 0.2064 | 0.6286 | 0.7224 | 0.6722 | 0.9691 |
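
Validation loss reaches its minimum (0.1064) at epochs 4-5 and climbs for the remaining 95 epochs, while F1 oscillates in roughly the 0.62-0.68 band (best: 0.6779 at epoch 7), so the final checkpoint is not the best one by either criterion. For retraining, early stopping with best-checkpoint selection would save most of the 100 epochs; a sketch using the stock Transformers callback (model and dataset variables are placeholders, not from this card):

```python
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

# Sketch only: `model`, `train_dataset`, and `eval_dataset` are placeholders.
args = TrainingArguments(
    output_dir="bert_bilstm_dst_crf-ner-weibo",  # placeholder
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,                 # restore the best checkpoint
    metric_for_best_model="f1",
    num_train_epochs=100,                        # upper bound; stops earlier
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    # stop after 5 consecutive evaluations without F1 improvement
    callbacks=[EarlyStoppingCallback(early_stopping_patience=5)],
)
trainer.train()
```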

Framework versions

  • Transformers 4.46.1
  • PyTorch 2.4.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.2

Model size

  • 102M parameters (safetensors, F32)
