metadata

license: cc-by-nc-4.0
base_model: facebook/nllb-200-3.3B
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: nllb-200-3.3B-finetuned
    results: []

nllb-200-3.3B-finetuned

This model is a fine-tuned version of facebook/nllb-200-3.3B on an unknown dataset. It achieves the following results on the evaluation set at the final training step (5000):

Loss: 1.4525
Rouge: 0.0357
Gen Len: 24.5

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
training_steps: 5000
mixed_precision_training: Native AMP
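
For reference, the values listed above map roughly onto keyword arguments of `Seq2SeqTrainingArguments` in Transformers. This is a minimal sketch, not the original training script. `output_dir` is a hypothetical path. The batch sizes are assumed to be per-device on a single device. `predict_with_generate` is assumed because generation metrics were reported, and "Native AMP" is read as `fp16=True`. The `cosine_lr` helper is illustrative and assumes zero warmup, since no warmup setting is listed.

```python
import math

# Keyword arguments that could be passed as
# transformers.Seq2SeqTrainingArguments(**training_kwargs).
training_kwargs = {
    "output_dir": "nllb-200-3.3B-finetuned",  # hypothetical path
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 8,  # assumes a single device
    "per_device_eval_batch_size": 8,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "max_steps": 5000,
    "fp16": True,                   # assumed reading of "Native AMP"
    "predict_with_generate": True,  # assumption: needed for Rouge / Gen Len
}


def cosine_lr(step, base_lr=1e-5, total_steps=5000):
    """Half-cosine decay from base_lr to 0, with no warmup (assumed)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Learning rate at the start, midpoint, and end of training.
for step in (0, 2500, 5000):
    print(step, cosine_lr(step))
```

Under these assumptions the learning rate starts at 1e-05, passes through half that value at step 2500, and reaches zero at step 5000.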

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge	Gen Len
4.34	500.0	500	2.9012	0.05	26.0
0.609	1000.0	1000	1.1881	0.098	26.0
0.0196	1500.0	1500	1.5325	0.1493	30.5
0.0061	2000.0	2000	1.5448	0.125	33.5
0.0036	2500.0	2500	1.5625	0.125	33.5
0.0025	3000.0	3000	1.5641	0.125	33.5
0.002	3500.0	3500	1.5626	0.125	33.5
0.0017	4000.0	4000	1.4340	0.0357	24.5
0.0016	4500.0	4500	1.4486	0.0357	24.5
0.0016	5000.0	5000	1.4525	0.0357	24.5
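
The log above can be examined programmatically to compare evaluation steps. The sketch below copies the numeric columns from the table and picks the step with the lowest validation loss and the step with the highest Rouge. The names `rows` and `best_by` are illustrative only.

```python
# (step, training_loss, validation_loss, rouge, gen_len), copied from the table above.
rows = [
    (500, 4.34, 2.9012, 0.05, 26.0),
    (1000, 0.609, 1.1881, 0.098, 26.0),
    (1500, 0.0196, 1.5325, 0.1493, 30.5),
    (2000, 0.0061, 1.5448, 0.125, 33.5),
    (2500, 0.0036, 1.5625, 0.125, 33.5),
    (3000, 0.0025, 1.5641, 0.125, 33.5),
    (3500, 0.002, 1.5626, 0.125, 33.5),
    (4000, 0.0017, 1.4340, 0.0357, 24.5),
    (4500, 0.0016, 1.4486, 0.0357, 24.5),
    (5000, 0.0016, 1.4525, 0.0357, 24.5),
]


def best_by(rows, column, highest=False):
    """Return the row whose value in `column` is lowest (or highest)."""
    pick = max if highest else min
    return pick(rows, key=lambda row: row[column])


best_loss = best_by(rows, column=2)                 # lowest validation loss
best_rouge = best_by(rows, column=3, highest=True)  # highest Rouge
final = rows[-1]

print(f"lowest validation loss: {best_loss[2]} at step {best_loss[0]}")
print(f"highest Rouge: {best_rouge[3]} at step {best_rouge[0]}")
print(f"final step {final[0]}: loss {final[2]}, Rouge {final[3]}")
```

In this log the lowest validation loss (1.1881) occurs at step 1000 and the highest Rouge (0.1493) at step 1500, while the final step reports 1.4525 and 0.0357. Training loss keeps falling toward 0.0016 after validation loss stops improving, which may indicate overfitting. An earlier checkpoint could be worth comparing if one was saved.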

Framework versions

Transformers 4.39.2
Pytorch 2.2.2+cu121
Datasets 2.21.0
Tokenizers 0.15.2
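
To compare a local environment with the versions listed above, something like the following can be used. It is a small sketch built on the standard library's `importlib.metadata`. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the usual PyPI names, and `installed_vs_expected` is an illustrative helper, not part of any library.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in this model card.
EXPECTED = {
    "transformers": "4.39.2",
    "torch": "2.2.2+cu121",
    "datasets": "2.21.0",
    "tokenizers": "0.15.2",
}


def installed_vs_expected(expected=EXPECTED):
    """Map each distribution to (expected, installed or None, exact match?)."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # distribution is not installed
        report[name] = (wanted, found, found == wanted)
    return report


for name, (wanted, found, ok) in installed_vs_expected().items():
    status = "match" if ok else "differs"
    print(f"{name}: expected {wanted}, installed {found} ({status})")
```

An exact string comparison is deliberately strict. A different CUDA build of PyTorch, for example, is reported as a mismatch even when the base version is the same.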