Commit f996682 by asahi417 · 1 Parent(s): 015936b

model update

README.md CHANGED
@@ -18,31 +18,31 @@ model-index:
  metrics:
  - name: F1
  type: f1
- value: 0.6430868167202574
+ value: 0.7060755336617406
  - name: Precision
  type: precision
- value: 0.6578947368421053
+ value: 0.738831615120275
  - name: Recall
  type: recall
- value: 0.6289308176100629
+ value: 0.6761006289308176
  - name: F1 (macro)
  type: f1_macro
- value: 0.37234464254803534
+ value: 0.45092058848834204
  - name: Precision (macro)
  type: precision_macro
- value: 0.3758815642868512
+ value: 0.45426465258085835
  - name: Recall (macro)
  type: recall_macro
- value: 0.3836106023606024
+ value: 0.45582773707773705
  - name: F1 (entity span)
  type: f1_entity_span
- value: 0.6883116883116883
+ value: 0.7293729372937293
  - name: Precision (entity span)
  type: precision_entity_span
- value: 0.7043189368770764
+ value: 0.7594501718213058
  - name: Recall (entity span)
  type: recall_entity_span
- value: 0.6730158730158731
+ value: 0.7015873015873015
 
  pipeline_tag: token-classification
  widget:
@@ -55,26 +55,26 @@ This model is a fine-tuned version of [microsoft/deberta-v3-large](https://huggi
  [tner/fin](https://huggingface.co/datasets/tner/fin) dataset.
  Model fine-tuning is done via [T-NER](https://github.com/asahi417/tner)'s hyper-parameter search (see the repository
  for more detail). It achieves the following results on the test set:
- - F1 (micro): 0.6430868167202574
- - Precision (micro): 0.6578947368421053
- - Recall (micro): 0.6289308176100629
- - F1 (macro): 0.37234464254803534
- - Precision (macro): 0.3758815642868512
- - Recall (macro): 0.3836106023606024
+ - F1 (micro): 0.7060755336617406
+ - Precision (micro): 0.738831615120275
+ - Recall (micro): 0.6761006289308176
+ - F1 (macro): 0.45092058848834204
+ - Precision (macro): 0.45426465258085835
+ - Recall (macro): 0.45582773707773705
 
  The per-entity breakdown of the F1 score on the test set are below:
- - LOC: nan
- - MISC: nan
- - ORG: nan
- - PER: nan
+ - location: 0.4000000000000001
+ - organization: 0.5762711864406779
+ - other: 0.0
+ - person: 0.8274111675126904
 
  For F1 scores, the confidence interval is obtained by bootstrap as below:
  - F1 (micro):
- - 90%: [0.5722111059165758, 0.7112704135498799]
- - 95%: [0.557944362785127, 0.725353903079494]
+ - 90%: [0.6370316240330781, 0.7718233002182738]
+ - 95%: [0.6236274300363168, 0.7857205513784461]
  - F1 (macro):
- - 90%: [0.5722111059165758, 0.7112704135498799]
- - 95%: [0.557944362785127, 0.725353903079494]
+ - 90%: [0.6370316240330781, 0.7718233002182738]
+ - 95%: [0.6236274300363168, 0.7857205513784461]
 
  Full evaluation can be found at [metric file of NER](https://huggingface.co/tner/deberta-v3-large-fin/raw/main/eval/metric.json)
  and [metric file of entity span](https://huggingface.co/tner/deberta-v3-large-fin/raw/main/eval/metric_span.json).
@@ -100,14 +100,14 @@ The following hyperparameters were used during training:
  - dataset_name: None
  - local_dataset: None
  - model: microsoft/deberta-v3-large
- - crf: False
+ - crf: True
  - max_length: 128
- - epoch: 17
+ - epoch: 15
  - batch_size: 16
  - lr: 1e-05
  - random_seed: 42
  - gradient_accumulation_steps: 4
- - weight_decay: 1e-07
+ - weight_decay: None
  - lr_warmup_step_ratio: 0.1
  - max_grad_norm: 10.0
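One thing worth checking in the README changes: both before and after this commit, the README lists the same bootstrap interval for F1 (macro) as for F1 (micro), whereas `eval/metric.json` in the same commit records a separate `macro/f1_ci`. The sketch below is a minimal consistency check using values copied from this diff. The dictionary names and the helper `mismatched_averages` are hypothetical and not part of T-NER.

```python
# Intervals as printed in the updated README.md (copied from this diff).
readme_ci = {
    "micro": {"90": [0.6370316240330781, 0.7718233002182738],
              "95": [0.6236274300363168, 0.7857205513784461]},
    "macro": {"90": [0.6370316240330781, 0.7718233002182738],
              "95": [0.6236274300363168, 0.7857205513784461]},
}

# Intervals as stored in the updated eval/metric.json (copied from this diff).
metric_json_ci = {
    "micro": {"90": [0.6370316240330781, 0.7718233002182738],
              "95": [0.6236274300363168, 0.7857205513784461]},
    "macro": {"90": [0.39899778804703784, 0.5011709891949974],
              "95": [0.3874931369771246, 0.5136520300021123]},
}


def mismatched_averages(readme, metrics):
    """Return the averaging modes whose README interval differs from metric.json."""
    return sorted(mode for mode in readme if readme[mode] != metrics[mode])


# Lists every averaging mode where the two files disagree.
print(mismatched_averages(readme_ci, metric_json_ci))
```

With these inputs, only the macro interval differs between the two files, so `eval/metric.json` is probably the safer source for the macro confidence interval.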
 
config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "_name_or_path": "tner_ckpt/fin_deberta_v3_large/best_model",
+ "_name_or_path": "tner_ckpt/fin_deberta_v3_large/model_rcsnba/epoch_5",
  "architectures": [
  "DebertaV2ForTokenClassification"
  ],
eval/metric.json CHANGED
@@ -1 +1 @@
- {"micro/f1": 0.6430868167202574, "micro/f1_ci": {"90": [0.5722111059165758, 0.7112704135498799], "95": [0.557944362785127, 0.725353903079494]}, "micro/recall": 0.6289308176100629, "micro/precision": 0.6578947368421053, "macro/f1": 0.37234464254803534, "macro/f1_ci": {"90": [0.321037444212583, 0.4174222520031422], "95": [0.3126661472561014, 0.4276527473028317]}, "macro/recall": 0.3836106023606024, "macro/precision": 0.3758815642868512, "per_entity_metric": {"LOC": {"f1": NaN, "f1_ci": {"90": [NaN, NaN], "95": [NaN, NaN]}, "precision": 0.0, "recall": 0.0}, "MISC": {"f1": NaN, "f1_ci": {"90": [NaN, NaN], "95": [NaN, NaN]}, "precision": 0.0, "recall": 0.0}, "ORG": {"f1": NaN, "f1_ci": {"90": [NaN, NaN], "95": [NaN, NaN]}, "precision": 0.0, "recall": 0.0}, "PER": {"f1": NaN, "f1_ci": {"90": [NaN, NaN], "95": [NaN, NaN]}, "precision": 0.0, "recall": 0.0}}}
+ {"micro/f1": 0.7060755336617406, "micro/f1_ci": {"90": [0.6370316240330781, 0.7718233002182738], "95": [0.6236274300363168, 0.7857205513784461]}, "micro/recall": 0.6761006289308176, "micro/precision": 0.738831615120275, "macro/f1": 0.45092058848834204, "macro/f1_ci": {"90": [0.39899778804703784, 0.5011709891949974], "95": [0.3874931369771246, 0.5136520300021123]}, "macro/recall": 0.45582773707773705, "macro/precision": 0.45426465258085835, "per_entity_metric": {"location": {"f1": 0.4000000000000001, "f1_ci": {"90": [0.2857142857142857, 0.5091682785299806], "95": [0.2608695652173913, 0.5263157894736842]}, "precision": 0.35294117647058826, "recall": 0.46153846153846156}, "organization": {"f1": 0.5762711864406779, "f1_ci": {"90": [0.43634996582365004, 0.7079700983894904], "95": [0.4077472341386317, 0.7342135894078278]}, "precision": 0.5483870967741935, "recall": 0.6071428571428571}, "other": {"f1": 0.0, "f1_ci": {"90": [NaN, NaN], "95": [NaN, NaN]}, "precision": 0.0, "recall": 0.0}, "person": {"f1": 0.8274111675126904, "f1_ci": {"90": [0.7651849599675686, 0.8840794949060123], "95": [0.7459896055540471, 0.8967844202898553]}, "precision": 0.9157303370786517, "recall": 0.7546296296296297}}}
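The updated numbers in `eval/metric.json` can be sanity-checked against each other. The sketch below assumes the usual definitions: micro F1 as the harmonic mean of micro precision and recall, and macro F1 as the unweighted mean of the per-entity F1 scores. Whether T-NER computes them exactly this way is an assumption, although the figures in this commit are consistent with it. The helper `f1_from` is hypothetical.

```python
import math

# Per-entity scores copied from the updated eval/metric.json in this commit.
per_entity = {
    "location": {"f1": 0.4000000000000001, "precision": 0.35294117647058826, "recall": 0.46153846153846156},
    "organization": {"f1": 0.5762711864406779, "precision": 0.5483870967741935, "recall": 0.6071428571428571},
    "other": {"f1": 0.0, "precision": 0.0, "recall": 0.0},
    "person": {"f1": 0.8274111675126904, "precision": 0.9157303370786517, "recall": 0.7546296296296297},
}


def f1_from(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Micro F1 recomputed from the reported micro precision and recall.
micro_f1 = f1_from(0.738831615120275, 0.6761006289308176)

# Macro F1 recomputed as the plain average over the four entity types.
macro_f1 = sum(entry["f1"] for entry in per_entity.values()) / len(per_entity)

print(math.isclose(micro_f1, 0.7060755336617406, rel_tol=1e-9))
print(math.isclose(macro_f1, 0.45092058848834204, rel_tol=1e-9))
```

The `other` entity type has zero precision and recall, so its F1 of 0.0 pulls the macro average well below the micro score.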
eval/metric_span.json CHANGED
@@ -1 +1 @@
- {"micro/f1": 0.6883116883116883, "micro/f1_ci": {"90": [0.6137984272716044, 0.757765305655086], "95": [0.604156373368873, 0.7718631178707224]}, "micro/recall": 0.6730158730158731, "micro/precision": 0.7043189368770764, "macro/f1": 0.6883116883116883, "macro/f1_ci": {"90": [0.6137984272716044, 0.757765305655086], "95": [0.604156373368873, 0.7718631178707224]}, "macro/recall": 0.6730158730158731, "macro/precision": 0.7043189368770764}
+ {"micro/f1": 0.7293729372937293, "micro/f1_ci": {"90": [0.6546727092010601, 0.7960558252427186], "95": [0.6427420490321417, 0.8090595359078592]}, "micro/recall": 0.7015873015873015, "micro/precision": 0.7594501718213058, "macro/f1": 0.7293729372937293, "macro/f1_ci": {"90": [0.6546727092010601, 0.7960558252427186], "95": [0.6427420490321417, 0.8090595359078592]}, "macro/recall": 0.7015873015873015, "macro/precision": 0.7594501718213058}
eval/prediction.validation.json CHANGED
The diff for this file is too large to render. See raw diff
 
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f45adf48b9766bb5a576a605b96bb0325487f0e5ad3848967fc00fd616c9e8c1
- size 1736217519
+ oid sha256:bad3729608b27d27e70df820e6cc552dbe034d5ed064cbe4ac5c1f6e5a008727
+ size 1736223023
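The `pytorch_model.bin` entry is a Git LFS pointer, so the diff shows only the object hash and byte size rather than the weights. As a rough sketch, the two pointers can be parsed and compared as follows. The helper `parse_lfs_pointer` is hypothetical and handles only the three `key value` lines shown in this diff.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    fields["size"] = int(fields["size"])  # size is given in bytes
    return fields


old_pointer = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:f45adf48b9766bb5a576a605b96bb0325487f0e5ad3848967fc00fd616c9e8c1
size 1736217519
""")

new_pointer = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:bad3729608b27d27e70df820e6cc552dbe034d5ed064cbe4ac5c1f6e5a008727
size 1736223023
""")

# Difference in stored file size between the two checkpoints, in bytes.
size_delta = new_pointer["size"] - old_pointer["size"]
print(size_delta)  # 5504
```

The pointer alone does not say what accounts for the extra bytes. The diff confirms only that the stored object changed.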
tokenizer_config.json CHANGED
@@ -4,7 +4,7 @@
  "do_lower_case": false,
  "eos_token": "[SEP]",
  "mask_token": "[MASK]",
- "name_or_path": "tner_ckpt/fin_deberta_v3_large/best_model",
+ "name_or_path": "tner_ckpt/fin_deberta_v3_large/model_rcsnba/epoch_5",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "sp_model_kwargs": {},
trainer_config.json CHANGED
@@ -1 +1 @@
- {"dataset": ["tner/fin"], "dataset_split": "train", "dataset_name": null, "local_dataset": null, "model": "microsoft/deberta-v3-large", "crf": false, "max_length": 128, "epoch": 17, "batch_size": 16, "lr": 1e-05, "random_seed": 42, "gradient_accumulation_steps": 4, "weight_decay": 1e-07, "lr_warmup_step_ratio": 0.1, "max_grad_norm": 10.0}
+ {"dataset": ["tner/fin"], "dataset_split": "train", "dataset_name": null, "local_dataset": null, "model": "microsoft/deberta-v3-large", "crf": true, "max_length": 128, "epoch": 15, "batch_size": 16, "lr": 1e-05, "random_seed": 42, "gradient_accumulation_steps": 4, "weight_decay": null, "lr_warmup_step_ratio": 0.1, "max_grad_norm": 10.0}
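Because `trainer_config.json` is a single line, the changed hyperparameters are hard to spot in the raw diff. A small sketch that loads both versions, copied from this commit, and lists only the keys whose values differ; the variable names are hypothetical:

```python
import json

# Old and new trainer_config.json contents, copied from this diff.
old_cfg = json.loads('{"dataset": ["tner/fin"], "dataset_split": "train", "dataset_name": null, "local_dataset": null, "model": "microsoft/deberta-v3-large", "crf": false, "max_length": 128, "epoch": 17, "batch_size": 16, "lr": 1e-05, "random_seed": 42, "gradient_accumulation_steps": 4, "weight_decay": 1e-07, "lr_warmup_step_ratio": 0.1, "max_grad_norm": 10.0}')
new_cfg = json.loads('{"dataset": ["tner/fin"], "dataset_split": "train", "dataset_name": null, "local_dataset": null, "model": "microsoft/deberta-v3-large", "crf": true, "max_length": 128, "epoch": 15, "batch_size": 16, "lr": 1e-05, "random_seed": 42, "gradient_accumulation_steps": 4, "weight_decay": null, "lr_warmup_step_ratio": 0.1, "max_grad_norm": 10.0}')

# Map each changed key to its (old, new) pair; unchanged keys are skipped.
changed = {
    key: (old_cfg[key], new_cfg.get(key))
    for key in old_cfg
    if old_cfg[key] != new_cfg.get(key)
}

for key, (before, after) in changed.items():
    print(f"{key}: {before} -> {after}")
```

For this commit the loop reports three changes: `crf`, `epoch` and `weight_decay`, matching the hyperparameter list in the README hunk.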