End of training
- README.md +19 -10
- generation_config.json +1 -1
README.md
CHANGED
@@ -4,6 +4,9 @@ license: cc-by-nc-4.0
 base_model: facebook/nllb-200-distilled-600M
 tags:
 - generated_from_trainer
+metrics:
+- bleu
+- wer
 model-index:
 - name: nllb-200-distilled-600M-en2bem
   results: []
@@ -15,6 +18,11 @@ should probably proofread and complete it, then remove this comment. -->
 # nllb-200-distilled-600M-en2bem
 
 This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.3204
+- Bleu: 8.51
+- Chrf: 48.32
+- Wer: 83.1036
 
 ## Model description
 
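The card gains headline evaluation numbers here but still has no usage snippet. A minimal inference sketch, assuming the standard NLLB-200 conventions (FLORES-200 codes `eng_Latn` for English and `bem_Latn` for Bemba) and a placeholder repo id, since the repo namespace is not shown in this diff:

```python
# Hedged sketch: English-to-Bemba translation with the fine-tuned checkpoint.
# "nllb-200-distilled-600M-en2bem" is a placeholder repo id; prefix it with the
# actual Hub namespace. Language codes follow NLLB's FLORES-200 conventions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "nllb-200-distilled-600M-en2bem"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("How are you today?", return_tensors="pt")
# NLLB models need the target language forced as the first generated token.
out = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("bem_Latn"),
    max_length=200,  # matches the checkpoint's generation_config.json
)
print(tokenizer.batch_decode(out, skip_special_tokens=True)[0])
```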
@@ -37,22 +45,23 @@ The following hyperparameters were used during training:
 - train_batch_size: 16
 - eval_batch_size: 8
 - seed: 42
-- optimizer: Use
+- optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.03
-- num_epochs:
-- mixed_precision_training: Native AMP
+- num_epochs: 3
 
 ### Training results
 
-| Training Loss | Epoch | Step
+| Training Loss | Epoch | Step  | Validation Loss | Bleu | Chrf  | Wer     |
+|:-------------:|:-----:|:-----:|:---------------:|:----:|:-----:|:-------:|
+| 0.2594        | 1.0   | 5240  | 0.3208          | 7.99 | 47.42 | 83.9565 |
+| 0.2469        | 2.0   | 10480 | 0.3169          | 8.08 | 47.92 | 83.4161 |
+| 0.2148        | 3.0   | 15720 | 0.3204          | 8.51 | 48.32 | 83.1036 |
 
 
 ### Framework versions
 
-- Transformers 4.
-- Pytorch 2.
-- Datasets 3.
-- Tokenizers 0.21.
+- Transformers 4.47.1
+- Pytorch 2.5.1+cu121
+- Datasets 3.4.0
+- Tokenizers 0.21.0
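The hyperparameter bullets map almost one-to-one onto `Seq2SeqTrainingArguments`. A minimal sketch, assuming the card was produced by the standard `Trainer` flow; the learning rate sits above this hunk's visible context, so the value below is a placeholder, not the one actually used.

```python
# Hedged sketch: Seq2SeqTrainingArguments mirroring the hyperparameters listed
# on the card. learning_rate is NOT visible in this diff; the value below is a
# placeholder assumption. train_batch_size on the card maps to per-device size
# here under the usual single-GPU setup.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="nllb-200-distilled-600M-en2bem",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    num_train_epochs=3,
    learning_rate=2e-5,  # placeholder: not shown in the diff
)
```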
generation_config.json
CHANGED
@@ -4,5 +4,5 @@
   "eos_token_id": 2,
   "max_length": 200,
   "pad_token_id": 1,
-  "transformers_version": "4.
+  "transformers_version": "4.47.1"
 }
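The only substantive change here is the stamped `transformers_version`; the decoding defaults (max_length 200 and the NLLB pad/eos ids) are untouched. For context, this file supplies the defaults that `generate()` picks up for the checkpoint; a hedged sketch of reading them back, using the same placeholder repo id as above:

```python
# Hedged sketch: generation_config.json holds the checkpoint's default decoding
# settings; generate() reads them automatically unless overridden per call.
from transformers import GenerationConfig

gen = GenerationConfig.from_pretrained("nllb-200-distilled-600M-en2bem")  # placeholder id
print(gen.max_length)    # 200
print(gen.pad_token_id)  # 1
print(gen.eos_token_id)  # 2
```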