---
base_model: microsoft/mdeberta-v3-base
datasets:
- eriktks/conll2003
language:
- en
library_name: transformers
license: mit
metrics:
- precision
- recall
- f1
- accuracy
pipeline_tag: token-classification
tags:
- generated_from_trainer
model-index:
- name: mdeberta-v3-base-conll2003-en
results:
- task:
type: token-classification
name: Token Classification
dataset:
name: eriktks/conll2003
type: eriktks/conll2003
config: conll2003
split: validation
args: conll2003
metrics:
- type: precision
value: 0.9566232899566233
name: Precision
- type: recall
value: 0.9649949511948839
name: Recall
- type: f1
value: 0.9607908847184986
name: F1
- type: accuracy
value: 0.9929130485572991
name: Accuracy
---
# mdeberta-v3-base-conll2003-en
This model is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) for named-entity recognition on the [eriktks/conll2003](https://huggingface.co/datasets/eriktks/conll2003) dataset (the English split of CoNLL-2003).
It achieves the following results on the evaluation set:
- Loss: 0.0342
- Precision: 0.9566
- Recall: 0.9650
- F1: 0.9608
- Accuracy: 0.9929
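## How to use
A minimal inference sketch using the `transformers` pipeline API. The repository id below is a placeholder (it depends on where this checkpoint is published), and `aggregation_strategy="simple"` merges B-/I- sub-token predictions into whole entity spans.
```python
from transformers import pipeline

# Placeholder hub id; substitute the actual repository path of this checkpoint.
ner = pipeline(
    "token-classification",
    model="your-username/mdeberta-v3-base-conll2003-en",
    aggregation_strategy="simple",  # merge B-/I- word pieces into entity spans
)

# Each result dict carries entity_group, score, word, start, and end.
print(ner("George Washington lived in Mount Vernon, Virginia."))
```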
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5.0
- mixed_precision_training: Native AMP
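For reproducibility, here is a sketch of the corresponding `TrainingArguments`, assuming the standard `Trainer`-based token-classification setup (the `output_dir` is illustrative):
```python
from transformers import TrainingArguments

# Illustrative reconstruction of the configuration listed above.
training_args = TrainingArguments(
    output_dir="mdeberta-v3-base-conll2003-en",  # illustrative path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=2,  # effective train batch size: 16 * 2 = 32
    num_train_epochs=5.0,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,  # Native AMP mixed precision
)
```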
### Training results
| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log | 1.0 | 439 | 0.0509 | 0.9303 | 0.9456 | 0.9379 | 0.9890 |
| 0.1482 | 2.0 | 878 | 0.0359 | 0.9501 | 0.9583 | 0.9542 | 0.9918 |
| 0.0335 | 3.0 | 1317 | 0.0338 | 0.9530 | 0.9615 | 0.9572 | 0.9924 |
| 0.0191 | 4.0 | 1756 | 0.0346 | 0.9538 | 0.9635 | 0.9586 | 0.9926 |
| 0.0137 | 5.0 | 2195 | 0.0342 | 0.9566 | 0.9650 | 0.9608 | 0.9929 |
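The precision, recall, and F1 columns above are entity-level scores. A sketch of the kind of `seqeval`-based `compute_metrics` function that produces them, following the standard token-classification example (the exact implementation used for this run is an assumption; the label order is the dataset's standard one):
```python
import numpy as np
import evaluate

seqeval = evaluate.load("seqeval")
label_list = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG",
              "B-LOC", "I-LOC", "B-MISC", "I-MISC"]

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    # Ignore positions labeled -100 (special tokens / non-first sub-words).
    true_predictions = [
        [label_list[p] for p, l in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]
    true_labels = [
        [label_list[l] for p, l in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]
    results = seqeval.compute(predictions=true_predictions,
                              references=true_labels)
    return {
        "precision": results["overall_precision"],
        "recall": results["overall_recall"],
        "f1": results["overall_f1"],
        "accuracy": results["overall_accuracy"],
    }
```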
### Framework versions
- Transformers 4.44.2
- Pytorch 2.4.1+cu121
- Datasets 3.0.1
- Tokenizers 0.19.1