# ko_en_mt_nllb-200-distilled-1.3B
This model is a fine-tuned version of facebook/nllb-200-distilled-1.3B on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.3034
- Bleu: 0.4128
- Gen Len: 26.582
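
The snippet below is a minimal inference sketch for this Korean→English checkpoint, assuming it follows the standard NLLB-200 interface in Transformers (NLLB language codes such as `kor_Hang` and `eng_Latn`, with the target language selected via `forced_bos_token_id`). The example sentence is illustrative only.

```python
# Minimal usage sketch, assuming the checkpoint behaves like a standard
# NLLB-200 model in the Transformers library.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "ryusangwon/ko_en_mt_nllb-200-distilled-1.3B"
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="kor_Hang")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "안녕하세요, 만나서 반갑습니다."  # "Hello, nice to meet you." (illustrative)
inputs = tokenizer(text, return_tensors="pt")

# NLLB selects the target language by forcing its language token as the first
# generated token; "eng_Latn" is English in the NLLB-200 language-code scheme.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```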
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 48
- eval_batch_size: 48
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 192
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 10
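
The hyperparameters above could be expressed with `Seq2SeqTrainingArguments` roughly as follows. This is a sketch under the assumption that the standard `Seq2SeqTrainer` setup on a single GPU was used; data loading, model loading, and metric computation are omitted, and the output directory name is illustrative.

```python
# Hedged sketch of training arguments matching the listed hyperparameters.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ko_en_mt_nllb-200-distilled-1.3B",  # illustrative name
    learning_rate=5e-05,
    per_device_train_batch_size=48,
    per_device_eval_batch_size=48,
    gradient_accumulation_steps=4,   # effective batch size 48 * 4 = 192 on one GPU
    seed=42,
    optim="adamw_torch",             # AdamW; betas=(0.9, 0.999), eps=1e-08 are defaults
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=10,
    predict_with_generate=True,      # needed for BLEU / generation length during eval
)
```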
### Training results
| Training Loss | Epoch  | Step | Validation Loss | Bleu   | Gen Len |
|---------------|--------|------|-----------------|--------|---------|
| 0.4745        | 0.8421 | 500  | 0.4221          | 0.3503 | 26.1791 |
| 0.3114        | 1.6838 | 1000 | 0.3016          | 0.3943 | 26.5531 |
| 0.2625        | 2.5255 | 1500 | 0.2932          | 0.4058 | 26.6861 |
| 0.2426        | 3.3672 | 2000 | 0.2910          | 0.4056 | 26.5456 |
| 0.2203        | 4.2088 | 2500 | 0.2924          | 0.4093 | 26.5353 |
| 0.2011        | 5.0505 | 3000 | 0.2942          | 0.4113 | 26.5593 |
| 0.2094        | 5.8926 | 3500 | 0.2927          | 0.4147 | 26.6262 |
| 0.1882        | 6.7343 | 4000 | 0.2960          | 0.4111 | 26.5519 |
| 0.1867        | 7.576  | 4500 | 0.2981          | 0.4134 | 26.571  |
| 0.1799        | 8.4177 | 5000 | 0.3015          | 0.4132 | 26.5739 |
| 0.1713        | 9.2594 | 5500 | 0.3034          | 0.4128 | 26.582  |
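
The Bleu column appears to be on a 0-1 scale and Gen Len is the average generated length in tokens. A hedged sketch of how such numbers can be computed with the `evaluate` library is shown below; the metric choice and the helper names (`decoded_preds`, `decoded_labels`, `pred_lengths`) are assumptions, not confirmed by the card.

```python
# Hedged sketch of the evaluation metrics; inputs are hypothetical names for
# decoded hypotheses, reference translations, and generated token counts.
import evaluate
import numpy as np

bleu_metric = evaluate.load("bleu")  # reports BLEU on a 0-1 scale

def compute_metrics(decoded_preds, decoded_labels, pred_lengths):
    result = bleu_metric.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    return {
        "bleu": round(result["bleu"], 4),                   # e.g. 0.4128 at the final step
        "gen_len": round(float(np.mean(pred_lengths)), 4),  # average generated length
    }
```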
### Framework versions
- Transformers 4.47.0
- Pytorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.21.0