cobrayyxx committed on
Commit e0967a6 · verified · 1 Parent(s): 1ad5f4c

End of training

Files changed (2)
  1. README.md +19 -10
  2. generation_config.json +1 -1
README.md CHANGED
@@ -4,6 +4,9 @@ license: cc-by-nc-4.0
 base_model: facebook/nllb-200-distilled-600M
 tags:
 - generated_from_trainer
+metrics:
+- bleu
+- wer
 model-index:
 - name: nllb-200-distilled-600M-en2bem
   results: []
@@ -15,6 +18,11 @@ should probably proofread and complete it, then remove this comment. -->
 # nllb-200-distilled-600M-en2bem
 
 This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.3204
+- Bleu: 8.51
+- Chrf: 48.32
+- Wer: 83.1036
 
 ## Model description
 
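For reference, the Bleu, Chrf, and Wer figures added above are the kind produced by the Hugging Face `evaluate` package. Below is a minimal sketch of that computation; the sample sentences are hypothetical placeholders, and the exact decoding and post-processing behind the card's numbers are not shown in this diff.

```python
# Minimal sketch: computing Bleu, Chrf, and Wer with the `evaluate` package.
# The predictions/references here are hypothetical placeholders.
import evaluate

bleu = evaluate.load("sacrebleu")  # corpus-level BLEU
chrf = evaluate.load("chrf")       # character n-gram F-score
wer = evaluate.load("wer")         # word error rate

predictions = ["mwashibukeni mukwai"]        # hypothetical model output
references = ["mwashibukeni mukwai bonse"]   # hypothetical gold translation

print(bleu.compute(predictions=predictions, references=[[r] for r in references])["score"])
print(chrf.compute(predictions=predictions, references=[[r] for r in references])["score"])
# `wer` returns a ratio; the card appears to report it as a percentage.
print(wer.compute(predictions=predictions, references=references))
```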
 
@@ -37,22 +45,23 @@ The following hyperparameters were used during training:
 - train_batch_size: 16
 - eval_batch_size: 8
 - seed: 42
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.03
-- num_epochs: 1
-- mixed_precision_training: Native AMP
+- num_epochs: 3
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss | Bleu | Chrf | Wer      |
-|:-------------:|:-----:|:----:|:---------------:|:----:|:----:|:--------:|
-| No log        | 1.0   | 1    | 12.8183         | 0.52 | 8.65 | 117.1429 |
+| Training Loss | Epoch | Step  | Validation Loss | Bleu | Chrf  | Wer     |
+|:-------------:|:-----:|:-----:|:---------------:|:----:|:-----:|:-------:|
+| 0.2594        | 1.0   | 5240  | 0.3208          | 7.99 | 47.42 | 83.9565 |
+| 0.2469        | 2.0   | 10480 | 0.3169          | 8.08 | 47.92 | 83.4161 |
+| 0.2148        | 3.0   | 15720 | 0.3204          | 8.51 | 48.32 | 83.1036 |
 
 
 ### Framework versions
 
-- Transformers 4.50.3
-- Pytorch 2.6.0+cu124
-- Datasets 3.5.0
-- Tokenizers 0.21.1
+- Transformers 4.47.1
+- Pytorch 2.5.1+cu121
+- Datasets 3.4.0
+- Tokenizers 0.21.0
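For readers who want to reproduce the setup, the hyperparameter list above maps onto `Seq2SeqTrainingArguments` roughly as in the sketch below. This is not the author's actual training script; anything not visible in the hunk (output directory, learning rate, data wiring) is omitted or marked as an assumption.

```python
# Sketch: the hyperparameters listed above expressed as Seq2SeqTrainingArguments.
# learning_rate appears earlier in the card and is not visible in this hunk,
# so it is left at its default here.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-200-distilled-600M-en2bem",  # assumption
    per_device_train_batch_size=16,  # train_batch_size: 16
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    seed=42,
    optim="adamw_torch",             # AdamW with betas=(0.9, 0.999), eps=1e-08 (defaults)
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    num_train_epochs=3,              # the new value in this commit
    eval_strategy="epoch",           # assumption: the results table is per-epoch
    predict_with_generate=True,      # required to score Bleu/Chrf/Wer during eval
)
```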
 
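Since the card itself carries no usage snippet, here is a hedged example of translating English to Bemba with the fine-tuned checkpoint. The repo id `cobrayyxx/nllb-200-distilled-600M-en2bem` is inferred from the commit author and model name, and `bem_Latn` is the FLORES-200 code NLLB uses for Bemba; verify both against the Hub page.

```python
# Sketch: English -> Bemba translation with the fine-tuned NLLB checkpoint.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

repo = "cobrayyxx/nllb-200-distilled-600M-en2bem"  # assumption: inferred repo id
tokenizer = AutoTokenizer.from_pretrained(repo, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(repo)

inputs = tokenizer("Good morning, how are you?", return_tensors="pt")
generated = model.generate(
    **inputs,
    # NLLB needs the target language forced as the first generated token;
    # "bem_Latn" is the FLORES-200 code for Bemba.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("bem_Latn"),
    max_length=200,  # matches generation_config.json below
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```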
generation_config.json CHANGED
@@ -4,5 +4,5 @@
   "eos_token_id": 2,
   "max_length": 200,
   "pad_token_id": 1,
-  "transformers_version": "4.50.3"
+  "transformers_version": "4.47.1"
 }
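The `transformers_version` field changed above is stamped automatically when the config is saved, so this hunk simply records the environment that produced the checkpoint. A quick way to inspect these generation defaults (repo id again assumed):

```python
# Sketch: inspect the generation defaults shipped with the checkpoint.
from transformers import GenerationConfig

gen_cfg = GenerationConfig.from_pretrained("cobrayyxx/nllb-200-distilled-600M-en2bem")  # repo id assumed
print(gen_cfg.max_length)            # 200, per the file above
print(gen_cfg.transformers_version)  # "4.47.1" after this commit
```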