# touno_english_to_taglish
This model is a fine-tuned version of [facebook/nllb-200-distilled-1.3B](https://huggingface.co/facebook/nllb-200-distilled-1.3B) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.2274
## Model description
More information needed
## Intended uses & limitations
More information needed
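Pending fuller documentation, here is a minimal inference sketch. It assumes the checkpoint is hosted at `touno/touno_english_to_taglish` and follows standard NLLB language-code conventions; `tgl_Latn` (Tagalog) is an assumption for the closest available target code for Taglish output.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "touno/touno_english_to_taglish"

# Load the fine-tuned checkpoint with its NLLB tokenizer,
# setting the source language to English.
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("How are you doing today?", return_tensors="pt")

# NLLB-style models select the output language by forcing its language
# token as the first generated token; "tgl_Latn" is an assumption for
# the Taglish target here.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("tgl_Latn"),
    max_new_tokens=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```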
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch mirroring them follows the list):
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
- mixed_precision_training: Native AMP
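
As a rough sketch, the list above maps onto `Seq2SeqTrainingArguments` as shown below. `output_dir` and the per-epoch evaluation strategy are assumptions (the results table logs one evaluation per epoch); the remaining values mirror the list.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="touno_english_to_taglish",  # assumed output path
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,  # effective total train batch size: 32
    num_train_epochs=20,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    optim="adamw_torch",            # AdamW, betas=(0.9, 0.999), eps=1e-08
    seed=42,
    fp16=True,                      # "Native AMP" mixed precision
    eval_strategy="epoch",          # assumed from the per-epoch eval log
)
```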
### Training results
| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 2.3184        | 1.0     | 56   | 1.8555          |
| 1.6786        | 2.0     | 112  | 1.3361          |
| 1.3304        | 3.0     | 168  | 1.2605          |
| 1.18          | 4.0     | 224  | 1.2292          |
| 1.1077        | 5.0     | 280  | 1.2118          |
| 1.0331        | 6.0     | 336  | 1.2023          |
| 0.9891        | 7.0     | 392  | 1.2017          |
| 0.9283        | 8.0     | 448  | 1.1988          |
| 0.8641        | 9.0     | 504  | 1.2009          |
| 0.8241        | 10.0    | 560  | 1.2025          |
| 0.8004        | 11.0    | 616  | 1.2061          |
| 0.7804        | 12.0    | 672  | 1.2096          |
| 0.7278        | 13.0    | 728  | 1.2122          |
| 0.7366        | 14.0    | 784  | 1.2182          |
| 0.6856        | 15.0    | 840  | 1.2216          |
| 0.7073        | 16.0    | 896  | 1.2238          |
| 0.6963        | 17.0    | 952  | 1.2241          |
| 0.6728        | 18.0    | 1008 | 1.2261          |
| 0.6515        | 19.0    | 1064 | 1.2272          |
| 0.6776        | 19.6486 | 1100 | 1.2274          |
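
Note that validation loss bottoms out around epoch 8 (1.1988) and creeps upward afterwards even as training loss keeps falling, so an intermediate checkpoint may generalize slightly better than the final one.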
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 2.14.4
- Tokenizers 0.21.1
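
For reproduction, a quick sanity check that a local environment matches these pins (a sketch, not part of the original card):

```python
import transformers, torch, datasets, tokenizers

# Expected versions per this card: 4.51.3 / 2.6.0+cu124 / 2.14.4 / 0.21.1
for mod in (transformers, torch, datasets, tokenizers):
    print(f"{mod.__name__}=={mod.__version__}")
```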