metadata
library_name: transformers
language:
- lg
license: apache-2.0
base_model: google-bert/bert-base-multilingual-cased
tags:
- named-entity-recognition
- luganda
- african-languages
- pii-detection
- token-classification
- generated_from_trainer
datasets:
- Beijuka/Luganda_Monolingual_PII_dataset
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: luganda-ner-bert-v7
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: Beijuka/Luganda_Monolingual_PII_dataset
      type: Beijuka/Luganda_Monolingual_PII_dataset
      args: 'split: train+validation+test'
    metrics:
    - name: Precision
      type: precision
      value: 0.8368200836820083
    - name: Recall
      type: recall
      value: 0.7944389275074478
    - name: F1
      type: f1
      value: 0.815078960774325
    - name: Accuracy
      type: accuracy
      value: 0.9449490640839062
luganda-ner-bert-v7
This model is a fine-tuned version of google-bert/bert-base-multilingual-cased on Beijuka/Luganda_Monolingual_PII_dataset. It achieves the following results on the evaluation set (the figures match the epoch 12, step 3132 row of the training results table):
- Loss: 0.3699
- Precision: 0.8368
- Recall: 0.7944
- F1: 0.8151
- Accuracy: 0.9449
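As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported value can be recomputed from the other two. The sketch below uses the unrounded values from the model-index metadata; the helper name `f1_score` is illustrative and is not taken from the training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Unrounded values reported in the model card metadata.
precision = 0.8368200836820083
recall = 0.7944389275074478

f1 = f1_score(precision, recall)
print(round(f1, 4))  # 0.8151
```

The recomputed value agrees with the reported F1 of 0.8151, so the three headline metrics are internally consistent.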
Model description
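A token-classification model for named-entity recognition and PII detection in Luganda text, fine-tuned from google-bert/bert-base-multilingual-cased. More information needed on the entity label set.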
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 20 (the training results table ends at epoch 14)
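To make the `linear` scheduler setting concrete, the following is a minimal sketch of a linear decay from the base learning rate to zero. It assumes zero warmup steps (the card lists no warmup setting) and a schedule sized for all 20 epochs at 261 optimizer steps per epoch, the step count shown in the training results table. The function name `linear_lr` and the constants are illustrative, not taken from the training script.

```python
BASE_LR = 2e-05          # learning_rate from the hyperparameter list
NUM_EPOCHS = 20          # num_epochs from the hyperparameter list
STEPS_PER_EPOCH = 261    # from the Step column of the training results table
WARMUP_STEPS = 0         # assumption: no warmup setting is listed in the card

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 5220


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR during warmup.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Linear decay from BASE_LR down to 0 at TOTAL_STEPS, clamped at 0.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for epoch in (0, 4, 12, 14, 20):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:>2}  step {step:>4}  lr {linear_lr(step):.2e}")
```

Under these assumptions, the learning rate at the last logged epoch (14) would still be about 30% of the base value, since the decay is spread over all 20 configured epochs.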
Training results
Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
---|---|---|---|---|---|---|---|
No log | 1.0 | 261 | 0.5128 | 0.5595 | 0.2661 | 0.3607 | 0.8665 |
0.6082 | 2.0 | 522 | 0.3601 | 0.5826 | 0.6127 | 0.5973 | 0.9005 |
0.6082 | 3.0 | 783 | 0.2763 | 0.7691 | 0.6584 | 0.7095 | 0.9262 |
0.2267 | 4.0 | 1044 | 0.2551 | 0.7732 | 0.7279 | 0.7499 | 0.9344 |
0.2267 | 5.0 | 1305 | 0.2874 | 0.7308 | 0.7468 | 0.7387 | 0.9283 |
0.1155 | 6.0 | 1566 | 0.2874 | 0.7549 | 0.7646 | 0.7597 | 0.9364 |
0.1155 | 7.0 | 1827 | 0.2852 | 0.8069 | 0.7716 | 0.7888 | 0.9455 |
0.0605 | 8.0 | 2088 | 0.3063 | 0.7820 | 0.7944 | 0.7882 | 0.9420 |
0.0605 | 9.0 | 2349 | 0.3227 | 0.8022 | 0.7974 | 0.7998 | 0.9427 |
0.0345 | 10.0 | 2610 | 0.3288 | 0.8026 | 0.8034 | 0.8030 | 0.9408 |
0.0345 | 11.0 | 2871 | 0.3661 | 0.8010 | 0.8232 | 0.8119 | 0.9410 |
0.0182 | 12.0 | 3132 | 0.3699 | 0.8368 | 0.7944 | 0.8151 | 0.9449 |
0.0182 | 13.0 | 3393 | 0.3683 | 0.7958 | 0.7974 | 0.7966 | 0.9436 |
0.0133 | 14.0 | 3654 | 0.3811 | 0.8008 | 0.8064 | 0.8036 | 0.9422 |
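In the table, validation loss reaches its minimum early while F1 keeps improving for several more epochs, so the two criteria point at different checkpoints. The sketch below picks out both from values copied from the table; the variable names are illustrative, and it makes no claim about how the published checkpoint was actually selected.

```python
# (epoch, validation_loss, f1) copied from the training results table.
rows = [
    (1, 0.5128, 0.3607),
    (2, 0.3601, 0.5973),
    (3, 0.2763, 0.7095),
    (4, 0.2551, 0.7499),
    (5, 0.2874, 0.7387),
    (6, 0.2874, 0.7597),
    (7, 0.2852, 0.7888),
    (8, 0.3063, 0.7882),
    (9, 0.3227, 0.7998),
    (10, 0.3288, 0.8030),
    (11, 0.3661, 0.8119),
    (12, 0.3699, 0.8151),
    (13, 0.3683, 0.7966),
    (14, 0.3811, 0.8036),
]

# Checkpoint with the highest entity-level F1.
best_f1 = max(rows, key=lambda r: r[2])
# Checkpoint with the lowest validation loss.
lowest_loss = min(rows, key=lambda r: r[1])

print(f"best F1:     epoch {best_f1[0]} (F1 {best_f1[2]}, loss {best_f1[1]})")
print(f"lowest loss: epoch {lowest_loss[0]} (F1 {lowest_loss[2]}, loss {lowest_loss[1]})")
```

The headline metrics in this card match the epoch 12 row, which suggests, but does not confirm, that the checkpoint was chosen by F1 rather than by validation loss.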
Framework versions
- Transformers 4.53.0
- Pytorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.2