ilyes25 committed on
Commit
c0bbcbf
·
verified ·
1 Parent(s): 143cb90

End of training

Files changed (3)
  1. README.md +149 -195
  2. adapter.fr.safetensors +3 -0
  3. model.safetensors +1 -1
README.md CHANGED
@@ -1,199 +1,153 @@
1
  ---
2
  library_name: transformers
3
- tags: []
4
  ---
5
 
6
- # Model Card for Model ID
7
-
8
- <!-- Provide a quick summary of what the model is/does. -->
9
-
10
-
11
-
12
- ## Model Details
13
-
14
- ### Model Description
15
-
16
- <!-- Provide a longer summary of what this model is. -->
17
-
18
- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
19
-
20
- - **Developed by:** [More Information Needed]
21
- - **Funded by [optional]:** [More Information Needed]
22
- - **Shared by [optional]:** [More Information Needed]
23
- - **Model type:** [More Information Needed]
24
- - **Language(s) (NLP):** [More Information Needed]
25
- - **License:** [More Information Needed]
26
- - **Finetuned from model [optional]:** [More Information Needed]
27
-
28
- ### Model Sources [optional]
29
-
30
- <!-- Provide the basic links for the model. -->
31
-
32
- - **Repository:** [More Information Needed]
33
- - **Paper [optional]:** [More Information Needed]
34
- - **Demo [optional]:** [More Information Needed]
35
-
36
- ## Uses
37
-
38
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
39
-
40
- ### Direct Use
41
-
42
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
43
-
44
- [More Information Needed]
45
-
46
- ### Downstream Use [optional]
47
-
48
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
49
-
50
- [More Information Needed]
51
-
52
- ### Out-of-Scope Use
53
-
54
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
55
-
56
- [More Information Needed]
57
-
58
- ## Bias, Risks, and Limitations
59
-
60
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
61
-
62
- [More Information Needed]
63
-
64
- ### Recommendations
65
-
66
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
67
-
68
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
69
-
70
- ## How to Get Started with the Model
71
-
72
- Use the code below to get started with the model.
73
-
74
- [More Information Needed]
75
-
76
- ## Training Details
77
-
78
- ### Training Data
79
-
80
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
81
-
82
- [More Information Needed]
83
-
84
- ### Training Procedure
85
-
86
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
87
-
88
- #### Preprocessing [optional]
89
-
90
- [More Information Needed]
91
-
92
-
93
- #### Training Hyperparameters
94
-
95
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
96
-
97
- #### Speeds, Sizes, Times [optional]
98
-
99
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
100
-
101
- [More Information Needed]
102
-
103
- ## Evaluation
104
-
105
- <!-- This section describes the evaluation protocols and provides the results. -->
106
-
107
- ### Testing Data, Factors & Metrics
108
-
109
- #### Testing Data
110
-
111
- <!-- This should link to a Dataset Card if possible. -->
112
-
113
- [More Information Needed]
114
-
115
- #### Factors
116
-
117
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
118
-
119
- [More Information Needed]
120
-
121
- #### Metrics
122
-
123
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
124
-
125
- [More Information Needed]
126
-
127
- ### Results
128
-
129
- [More Information Needed]
130
-
131
- #### Summary
132
-
133
-
134
-
135
- ## Model Examination [optional]
136
-
137
- <!-- Relevant interpretability work for the model goes here -->
138
-
139
- [More Information Needed]
140
-
141
- ## Environmental Impact
142
-
143
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
144
-
145
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
146
-
147
- - **Hardware Type:** [More Information Needed]
148
- - **Hours used:** [More Information Needed]
149
- - **Cloud Provider:** [More Information Needed]
150
- - **Compute Region:** [More Information Needed]
151
- - **Carbon Emitted:** [More Information Needed]
152
-
153
- ## Technical Specifications [optional]
154
-
155
- ### Model Architecture and Objective
156
-
157
- [More Information Needed]
158
-
159
- ### Compute Infrastructure
160
-
161
- [More Information Needed]
162
-
163
- #### Hardware
164
-
165
- [More Information Needed]
166
-
167
- #### Software
168
-
169
- [More Information Needed]
170
-
171
- ## Citation [optional]
172
-
173
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
174
-
175
- **BibTeX:**
176
-
177
- [More Information Needed]
178
-
179
- **APA:**
180
-
181
- [More Information Needed]
182
-
183
- ## Glossary [optional]
184
-
185
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
186
-
187
- [More Information Needed]
188
-
189
- ## More Information [optional]
190
-
191
- [More Information Needed]
192
-
193
- ## Model Card Authors [optional]
194
-
195
- [More Information Needed]
196
-
197
- ## Model Card Contact
198
-
199
- [More Information Needed]
 
1
  ---
2
  library_name: transformers
3
+ license: cc-by-nc-4.0
4
+ base_model: facebook/mms-1b-all
5
+ tags:
6
+ - generated_from_trainer
7
+ metrics:
8
+ - wer
9
+ - bleu
10
+ - rouge
11
+ model-index:
12
+ - name: frdirect
13
+ results: []
14
  ---
15
 
16
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
17
+ should probably proofread and complete it, then remove this comment. -->
18
+
19
+ # frdirect
20
+
21
+ This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on an unknown dataset.
22
+ It achieves the following results on the evaluation set:
23
+ - Loss: 0.1592
24
+ - Wer: 0.1081
25
+ - Bleu: 0.7992
26
+ - Rouge: {'rouge1': 0.9142461267493629, 'rouge2': 0.8512223456977356, 'rougeL': 0.9140461108455781, 'rougeLsum': 0.9139759112519872}
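The `Wer` figure above is a word error rate: the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The card does not say which implementation produced it, so the following is only a minimal sketch of the metric itself. The function name `word_error_rate` and the example sentences are illustrative assumptions, not taken from this repository.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one word dropped out of six reference words.
example = word_error_rate("le chat dort sur le canapé", "le chat dort sur canapé")
```

A reported value of 0.1081 would then correspond to roughly one word-level error per nine to ten reference words, assuming the score was aggregated over the whole evaluation set.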
27
+
28
+ ## Model description
29
+
30
+ More information needed
31
+
32
+ ## Intended uses & limitations
33
+
34
+ More information needed
35
+
36
+ ## Training and evaluation data
37
+
38
+ More information needed
39
+
40
+ ## Training procedure
41
+
42
+ ### Training hyperparameters
43
+
44
+ The following hyperparameters were used during training:
45
+ - learning_rate: 0.001
46
+ - train_batch_size: 8
47
+ - eval_batch_size: 8
48
+ - seed: 42
49
+ - gradient_accumulation_steps: 4
50
+ - total_train_batch_size: 32
51
+ - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
52
+ - lr_scheduler_type: linear
53
+ - lr_scheduler_warmup_steps: 100
54
+ - num_epochs: 30
55
+ - mixed_precision_training: Native AMP
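The relationship between these values can be checked with a little arithmetic. The sketch below reproduces `total_train_batch_size` and traces a linear schedule with warmup. It rests on two assumptions that the card does not state: training ran on a single device, and one epoch is 285 optimizer steps (inferred from the results table, where step 5700 is logged at epoch 20.0). The function `linear_lr` is a hypothetical helper with the usual shape of a linear warmup and decay schedule, not the Trainer's own code.

```python
# Values copied from the hyperparameter list.
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 4
PEAK_LR = 1e-3
WARMUP_STEPS = 100
NUM_EPOCHS = 30

# Assumption: 285 optimizer steps per epoch, inferred from the results table
# (step 5700 is logged at epoch 20.0). The card does not state this directly.
STEPS_PER_EPOCH = 285
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS

# Assumption: a single device, so the effective batch size is the
# per-device batch size times the number of accumulation steps.
effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)
```

Under these assumptions the effective batch size is 8 × 4 = 32, matching the listed `total_train_batch_size`, and the learning rate climbs to 0.001 at step 100 before decaying linearly to zero at step 8550.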
56
+
57
+ ### Training results
58
+
59
+ | Training Loss | Epoch | Step | Validation Loss | Wer | Bleu | Rouge |
60
+ |:-------------:|:-------:|:----:|:---------------:|:------:|:------:|:---------------------------------------------------------------------------------------------------------------------------:|
61
+ | 6.0797 | 0.3512 | 100 | 0.3918 | 0.2286 | 0.6250 | {'rouge1': 0.7964874009051572, 'rouge2': 0.6786673339200013, 'rougeL': 0.795684539225159, 'rougeLsum': 0.7956009761818327} |
62
+ | 0.4361 | 0.7024 | 200 | 0.3395 | 0.2025 | 0.6572 | {'rouge1': 0.8286329956380034, 'rouge2': 0.7214521206108135, 'rougeL': 0.8280112126425977, 'rougeLsum': 0.8280818081170968} |
63
+ | 0.4069 | 1.0527 | 300 | 0.2683 | 0.1944 | 0.6691 | {'rouge1': 0.8333662643750593, 'rouge2': 0.7283520260829215, 'rougeL': 0.8329211797586267, 'rougeLsum': 0.8326890662231179} |
64
+ | 0.3738 | 1.4039 | 400 | 0.2500 | 0.1832 | 0.6885 | {'rouge1': 0.8475464903860954, 'rouge2': 0.7490364749206575, 'rougeL': 0.8470167743507384, 'rougeLsum': 0.8469648141276026} |
65
+ | 0.3393 | 1.7550 | 500 | 0.2473 | 0.1806 | 0.6872 | {'rouge1': 0.8487702037730754, 'rouge2': 0.7515019894679165, 'rougeL': 0.8482651202728038, 'rougeLsum': 0.8479933284128696} |
66
+ | 0.3337 | 2.1054 | 600 | 0.2341 | 0.1745 | 0.7015 | {'rouge1': 0.856524582926171, 'rouge2': 0.7615922551676065, 'rougeL': 0.8562090285236332, 'rougeLsum': 0.8560778549085171} |
67
+ | 0.3273 | 2.4565 | 700 | 0.2287 | 0.1776 | 0.6934 | {'rouge1': 0.8518960784546261, 'rouge2': 0.7571574767836958, 'rougeL': 0.8512601033375955, 'rougeLsum': 0.8510232422493931} |
68
+ | 0.3212 | 2.8077 | 800 | 0.2195 | 0.1673 | 0.7067 | {'rouge1': 0.8597006641882449, 'rouge2': 0.7675023646180051, 'rougeL': 0.8592257714231155, 'rougeLsum': 0.8593257522509605} |
69
+ | 0.2989 | 3.1580 | 900 | 0.2214 | 0.1633 | 0.7125 | {'rouge1': 0.8652652267566208, 'rouge2': 0.7751359730996655, 'rougeL': 0.8651950850137953, 'rougeLsum': 0.8648147312388779} |
70
+ | 0.2849 | 3.5092 | 1000 | 0.2176 | 0.1610 | 0.7119 | {'rouge1': 0.8674026972599356, 'rouge2': 0.7767934576217497, 'rougeL': 0.8670611483514326, 'rougeLsum': 0.866923394119187} |
71
+ | 0.321 | 3.8604 | 1100 | 0.2140 | 0.1562 | 0.7203 | {'rouge1': 0.8687120531054047, 'rouge2': 0.7801284964910911, 'rougeL': 0.8685687320059696, 'rougeLsum': 0.8686957534611408} |
72
+ | 0.2901 | 4.2107 | 1200 | 0.2092 | 0.1570 | 0.7233 | {'rouge1': 0.870783939334159, 'rouge2': 0.7846870067295553, 'rougeL': 0.8700614974709336, 'rougeLsum': 0.8703440992984535} |
73
+ | 0.2758 | 4.5619 | 1300 | 0.2208 | 0.1678 | 0.7044 | {'rouge1': 0.8627882593824365, 'rouge2': 0.7732336382659819, 'rougeL': 0.8623362277788245, 'rougeLsum': 0.862456577073438} |
74
+ | 0.2802 | 4.9131 | 1400 | 0.2039 | 0.1547 | 0.7258 | {'rouge1': 0.8731727674477189, 'rouge2': 0.7886178130374446, 'rougeL': 0.872672601389526, 'rougeLsum': 0.8726659641042169} |
75
+ | 0.2638 | 5.2634 | 1500 | 0.2043 | 0.1510 | 0.7335 | {'rouge1': 0.8755955637027819, 'rouge2': 0.7930493884188712, 'rougeL': 0.8751631654870777, 'rougeLsum': 0.8751907751029582} |
76
+ | 0.2752 | 5.6146 | 1600 | 0.2055 | 0.1551 | 0.7270 | {'rouge1': 0.872388381525685, 'rouge2': 0.7875275384987104, 'rougeL': 0.8719998038854011, 'rougeLsum': 0.8716380106368946} |
77
+ | 0.2611 | 5.9658 | 1700 | 0.2000 | 0.1470 | 0.7371 | {'rouge1': 0.8788848516546419, 'rouge2': 0.7961419908259184, 'rougeL': 0.8787077158049774, 'rougeLsum': 0.8785400491349351} |
78
+ | 0.2473 | 6.3161 | 1800 | 0.1964 | 0.1480 | 0.7367 | {'rouge1': 0.8780453998988988, 'rouge2': 0.7968768691849546, 'rougeL': 0.877539022180082, 'rougeLsum': 0.8772607424486614} |
79
+ | 0.2595 | 6.6673 | 1900 | 0.2025 | 0.1480 | 0.7381 | {'rouge1': 0.879639846099505, 'rouge2': 0.797600429803611, 'rougeL': 0.8793686789606971, 'rougeLsum': 0.8790549352654082} |
80
+ | 0.2689 | 7.0176 | 2000 | 0.1969 | 0.1432 | 0.7430 | {'rouge1': 0.881797326390697, 'rouge2': 0.8004647695765528, 'rougeL': 0.8813203554835087, 'rougeLsum': 0.8811828285656307} |
81
+ | 0.246 | 7.3687 | 2100 | 0.1963 | 0.1449 | 0.7398 | {'rouge1': 0.8817110807418125, 'rouge2': 0.8017781199834159, 'rougeL': 0.8815737656302565, 'rougeLsum': 0.8813654064210932} |
82
+ | 0.2502 | 7.7199 | 2200 | 0.1925 | 0.1492 | 0.7347 | {'rouge1': 0.8793953293462229, 'rouge2': 0.7995783364307951, 'rougeL': 0.8789946756811035, 'rougeLsum': 0.8792323972067102} |
83
+ | 0.2355 | 8.0702 | 2300 | 0.1912 | 0.1402 | 0.7460 | {'rouge1': 0.8848122766361803, 'rouge2': 0.805493594353921, 'rougeL': 0.8843513031942714, 'rougeLsum': 0.8847078352624693} |
84
+ | 0.2366 | 8.4214 | 2400 | 0.1885 | 0.1412 | 0.7426 | {'rouge1': 0.8840181215146308, 'rouge2': 0.8027935769840546, 'rougeL': 0.8836147367072817, 'rougeLsum': 0.8836426860532026} |
85
+ | 0.2407 | 8.7726 | 2500 | 0.1918 | 0.1447 | 0.7397 | {'rouge1': 0.8821736207296971, 'rouge2': 0.801567427041519, 'rougeL': 0.881521485738069, 'rougeLsum': 0.88187487496196} |
86
+ | 0.2387 | 9.1229 | 2600 | 0.1903 | 0.1334 | 0.7605 | {'rouge1': 0.8897429883784935, 'rouge2': 0.8137161586716353, 'rougeL': 0.8892440561394621, 'rougeLsum': 0.8891549602299966} |
87
+ | 0.2347 | 9.4741 | 2700 | 0.1834 | 0.1345 | 0.7572 | {'rouge1': 0.8881942208247308, 'rouge2': 0.8119581058011768, 'rougeL': 0.8879981514910154, 'rougeLsum': 0.887901612349636} |
88
+ | 0.2271 | 9.8253 | 2800 | 0.1858 | 0.1335 | 0.7590 | {'rouge1': 0.8911506791113586, 'rouge2': 0.8148539957097256, 'rougeL': 0.8905751461168572, 'rougeLsum': 0.890902060551137} |
89
+ | 0.2303 | 10.1756 | 2900 | 0.1858 | 0.1365 | 0.7501 | {'rouge1': 0.887481336474667, 'rouge2': 0.8085757416110948, 'rougeL': 0.887243114877972, 'rougeLsum': 0.8871195261439274} |
90
+ | 0.2284 | 10.5268 | 3000 | 0.1873 | 0.1348 | 0.7543 | {'rouge1': 0.8896704902968307, 'rouge2': 0.8119666309133653, 'rougeL': 0.8895050087347427, 'rougeLsum': 0.8893208588598065} |
91
+ | 0.2177 | 10.8780 | 3100 | 0.1899 | 0.1412 | 0.7429 | {'rouge1': 0.883841093950449, 'rouge2': 0.8027541651011862, 'rougeL': 0.8832016518265267, 'rougeLsum': 0.8831940672343328} |
92
+ | 0.2259 | 11.2283 | 3200 | 0.1841 | 0.1382 | 0.7505 | {'rouge1': 0.8861091671648993, 'rouge2': 0.8076445498910136, 'rougeL': 0.8860146387775966, 'rougeLsum': 0.8856781536323487} |
93
+ | 0.2183 | 11.5795 | 3300 | 0.1803 | 0.1333 | 0.7589 | {'rouge1': 0.8917208112217359, 'rouge2': 0.8180400270633807, 'rougeL': 0.8911859595950976, 'rougeLsum': 0.8913321406932759} |
94
+ | 0.2124 | 11.9306 | 3400 | 0.1826 | 0.1309 | 0.7626 | {'rouge1': 0.8924689635777846, 'rouge2': 0.8186471633950445, 'rougeL': 0.891927412819415, 'rougeLsum': 0.8920422701086449} |
95
+ | 0.1961 | 12.2809 | 3500 | 0.1824 | 0.1300 | 0.7648 | {'rouge1': 0.8947275911029863, 'rouge2': 0.8218776029324886, 'rougeL': 0.8942104883105186, 'rougeLsum': 0.894350829557474} |
96
+ | 0.2121 | 12.6321 | 3600 | 0.1792 | 0.1278 | 0.7649 | {'rouge1': 0.8965227623557459, 'rouge2': 0.8223749938722336, 'rougeL': 0.8961352103229281, 'rougeLsum': 0.8959098417776623} |
97
+ | 0.2087 | 12.9833 | 3700 | 0.1767 | 0.1294 | 0.7648 | {'rouge1': 0.8960785872160766, 'rouge2': 0.8226243103661531, 'rougeL': 0.8959515880736657, 'rougeLsum': 0.8959147567781259} |
98
+ | 0.1943 | 13.3336 | 3800 | 0.1801 | 0.1288 | 0.7644 | {'rouge1': 0.8947704603138941, 'rouge2': 0.8196022891096679, 'rougeL': 0.8945419678439788, 'rougeLsum': 0.8941955883885566} |
99
+ | 0.2053 | 13.6848 | 3900 | 0.1732 | 0.1269 | 0.7682 | {'rouge1': 0.8953389907409508, 'rouge2': 0.8204708771662934, 'rougeL': 0.8942418907803242, 'rougeLsum': 0.8943319307650137} |
100
+ | 0.2196 | 14.0351 | 4000 | 0.1722 | 0.1258 | 0.7704 | {'rouge1': 0.8971096393899529, 'rouge2': 0.8245173641233066, 'rougeL': 0.8968754492659268, 'rougeLsum': 0.8967885844102701} |
101
+ | 0.1996 | 14.3863 | 4100 | 0.1746 | 0.1283 | 0.7663 | {'rouge1': 0.8992149614052899, 'rouge2': 0.8283870278179525, 'rougeL': 0.8990034855026199, 'rougeLsum': 0.899113705236827} |
102
+ | 0.2028 | 14.7375 | 4200 | 0.1723 | 0.1258 | 0.7688 | {'rouge1': 0.8981221304907357, 'rouge2': 0.8268085111954614, 'rougeL': 0.8978730997154603, 'rougeLsum': 0.8978255561942574} |
103
+ | 0.1784 | 15.0878 | 4300 | 0.1741 | 0.1210 | 0.7777 | {'rouge1': 0.9022989643269592, 'rouge2': 0.8321097618769113, 'rougeL': 0.9019679668540621, 'rougeLsum': 0.9020840316410879} |
104
+ | 0.1954 | 15.4390 | 4400 | 0.1746 | 0.1234 | 0.7748 | {'rouge1': 0.8990459493090655, 'rouge2': 0.8287806337458845, 'rougeL': 0.8990074200510402, 'rougeLsum': 0.8988288738757491} |
105
+ | 0.1916 | 15.7902 | 4500 | 0.1719 | 0.1230 | 0.7761 | {'rouge1': 0.900488872562492, 'rouge2': 0.8307065830708865, 'rougeL': 0.9000372599293843, 'rougeLsum': 0.9002263530831652} |
106
+ | 0.1883 | 16.1405 | 4600 | 0.1712 | 0.1226 | 0.7757 | {'rouge1': 0.9019026814628661, 'rouge2': 0.8329976152495208, 'rougeL': 0.9016592566182122, 'rougeLsum': 0.9016282070055117} |
107
+ | 0.1832 | 16.4917 | 4700 | 0.1713 | 0.1248 | 0.7733 | {'rouge1': 0.8995223210908226, 'rouge2': 0.8290222714943427, 'rougeL': 0.8994032458040973, 'rougeLsum': 0.899366836054343} |
108
+ | 0.1888 | 16.8428 | 4800 | 0.1698 | 0.1264 | 0.7721 | {'rouge1': 0.8982535964067325, 'rouge2': 0.8288396477969829, 'rougeL': 0.8977273104751539, 'rougeLsum': 0.8976965343038692} |
109
+ | 0.1857 | 17.1932 | 4900 | 0.1718 | 0.1230 | 0.7757 | {'rouge1': 0.9026836932266615, 'rouge2': 0.833487216136216, 'rougeL': 0.9022265966641445, 'rougeLsum': 0.9024265508303717} |
110
+ | 0.1858 | 17.5443 | 5000 | 0.1705 | 0.1204 | 0.7792 | {'rouge1': 0.9046938392928605, 'rouge2': 0.8378406365404705, 'rougeL': 0.904281513678646, 'rougeLsum': 0.9040395290033556} |
111
+ | 0.1838 | 17.8955 | 5100 | 0.1713 | 0.1222 | 0.7773 | {'rouge1': 0.9025388171823945, 'rouge2': 0.8339417592886358, 'rougeL': 0.901982659834949, 'rougeLsum': 0.902191073170105} |
112
+ | 0.1784 | 18.2458 | 5200 | 0.1710 | 0.1228 | 0.7741 | {'rouge1': 0.9027135961066619, 'rouge2': 0.8338803735375095, 'rougeL': 0.9021313465595333, 'rougeLsum': 0.9022241778623963} |
113
+ | 0.1748 | 18.5970 | 5300 | 0.1700 | 0.1205 | 0.7803 | {'rouge1': 0.9028943462107296, 'rouge2': 0.8360887354751103, 'rougeL': 0.9028049282097476, 'rougeLsum': 0.9026754818313738} |
114
+ | 0.1785 | 18.9482 | 5400 | 0.1683 | 0.1191 | 0.7827 | {'rouge1': 0.9058615541527929, 'rouge2': 0.8397686128782502, 'rougeL': 0.905380338988193, 'rougeLsum': 0.9053818296865667} |
115
+ | 0.1715 | 19.2985 | 5500 | 0.1693 | 0.1197 | 0.7813 | {'rouge1': 0.9042951746231659, 'rouge2': 0.836346216169132, 'rougeL': 0.9038200107993153, 'rougeLsum': 0.9038172163450839} |
116
+ | 0.1743 | 19.6497 | 5600 | 0.1656 | 0.1198 | 0.7820 | {'rouge1': 0.9056265375031829, 'rouge2': 0.8389744056310899, 'rougeL': 0.9052424864181929, 'rougeLsum': 0.9052107898473516} |
117
+ | 0.179 | 20.0 | 5700 | 0.1662 | 0.1200 | 0.7813 | {'rouge1': 0.9049957724375504, 'rouge2': 0.838664745594604, 'rougeL': 0.9047378933290081, 'rougeLsum': 0.9047796552290915} |
118
+ | 0.1705 | 20.3512 | 5800 | 0.1671 | 0.1158 | 0.7875 | {'rouge1': 0.9066988025993499, 'rouge2': 0.84144312190399, 'rougeL': 0.9066516406825784, 'rougeLsum': 0.9064192832523982} |
119
+ | 0.1737 | 20.7024 | 5900 | 0.1668 | 0.1191 | 0.7809 | {'rouge1': 0.9044882408830994, 'rouge2': 0.8379831752266652, 'rougeL': 0.9044191300138034, 'rougeLsum': 0.9041602420355832} |
120
+ | 0.161 | 21.0527 | 6000 | 0.1675 | 0.1176 | 0.7855 | {'rouge1': 0.9057981628991186, 'rouge2': 0.8402890351927816, 'rougeL': 0.9055204210299842, 'rougeLsum': 0.9055949937515819} |
121
+ | 0.1634 | 21.4039 | 6100 | 0.1656 | 0.1172 | 0.7849 | {'rouge1': 0.9052591361441602, 'rouge2': 0.8390947395192179, 'rougeL': 0.904901443457665, 'rougeLsum': 0.9049274938729717} |
122
+ | 0.1717 | 21.7550 | 6200 | 0.1655 | 0.1184 | 0.7850 | {'rouge1': 0.9056855411229652, 'rouge2': 0.8391993079429194, 'rougeL': 0.9051256484322957, 'rougeLsum': 0.9054924942388913} |
123
+ | 0.1532 | 22.1054 | 6300 | 0.1640 | 0.1138 | 0.7895 | {'rouge1': 0.908514689307546, 'rouge2': 0.8432231540949471, 'rougeL': 0.9081112639646672, 'rougeLsum': 0.908272589564951} |
124
+ | 0.167 | 22.4565 | 6400 | 0.1626 | 0.1140 | 0.7921 | {'rouge1': 0.9103164260947302, 'rouge2': 0.847605808196412, 'rougeL': 0.9098933635969182, 'rougeLsum': 0.9098438675661122} |
125
+ | 0.1606 | 22.8077 | 6500 | 0.1632 | 0.1115 | 0.7956 | {'rouge1': 0.9112987341936316, 'rouge2': 0.8492302659397843, 'rougeL': 0.9111470792331513, 'rougeLsum': 0.9109445157457727} |
126
+ | 0.1599 | 23.1580 | 6600 | 0.1642 | 0.1108 | 0.7960 | {'rouge1': 0.9118713013856634, 'rouge2': 0.848580538085876, 'rougeL': 0.9114618012803377, 'rougeLsum': 0.9116046463908325} |
127
+ | 0.156 | 23.5092 | 6700 | 0.1634 | 0.1122 | 0.7950 | {'rouge1': 0.9106394674615637, 'rouge2': 0.8474609350248357, 'rougeL': 0.910065118851396, 'rougeLsum': 0.9100662823980902} |
128
+ | 0.1589 | 23.8604 | 6800 | 0.1625 | 0.1130 | 0.7916 | {'rouge1': 0.9103350470563911, 'rouge2': 0.8474738385027315, 'rougeL': 0.9099995608863516, 'rougeLsum': 0.9100397914660747} |
129
+ | 0.1622 | 24.2107 | 6900 | 0.1626 | 0.1117 | 0.7943 | {'rouge1': 0.9133839350790938, 'rouge2': 0.8511081438025545, 'rougeL': 0.9126575984413424, 'rougeLsum': 0.9126505650621592} |
130
+ | 0.1521 | 24.5619 | 7000 | 0.1618 | 0.1109 | 0.7963 | {'rouge1': 0.912211680613469, 'rouge2': 0.8496181639239891, 'rougeL': 0.9117487343663472, 'rougeLsum': 0.9117499155750092} |
131
+ | 0.1503 | 24.9131 | 7100 | 0.1612 | 0.1115 | 0.7945 | {'rouge1': 0.9119319650927245, 'rouge2': 0.8489304675858942, 'rougeL': 0.9115897952726388, 'rougeLsum': 0.9116405544813904} |
132
+ | 0.1504 | 25.2634 | 7200 | 0.1621 | 0.1103 | 0.7957 | {'rouge1': 0.9121880227389143, 'rouge2': 0.8492463401850738, 'rougeL': 0.9118447435959087, 'rougeLsum': 0.9118033384400608} |
133
+ | 0.1519 | 25.6146 | 7300 | 0.1615 | 0.1118 | 0.7931 | {'rouge1': 0.9112498683244998, 'rouge2': 0.8480832804563686, 'rougeL': 0.9107105272229861, 'rougeLsum': 0.9107937803611704} |
134
+ | 0.1479 | 25.9658 | 7400 | 0.1611 | 0.1098 | 0.7974 | {'rouge1': 0.9136242054251107, 'rouge2': 0.8509392563862166, 'rougeL': 0.9130826085424906, 'rougeLsum': 0.9132200366521013} |
135
+ | 0.1437 | 26.3161 | 7500 | 0.1609 | 0.1099 | 0.7958 | {'rouge1': 0.9130313007924953, 'rouge2': 0.8502201629741937, 'rougeL': 0.9126113638303235, 'rougeLsum': 0.9126553187114919} |
136
+ | 0.148 | 26.6673 | 7600 | 0.1609 | 0.1095 | 0.7968 | {'rouge1': 0.9137297115874194, 'rouge2': 0.8504308144873894, 'rougeL': 0.9133787545300738, 'rougeLsum': 0.9133679379590172} |
137
+ | 0.1568 | 27.0176 | 7700 | 0.1607 | 0.1107 | 0.7945 | {'rouge1': 0.9131462089688931, 'rouge2': 0.8499313234434469, 'rougeL': 0.912728356326638, 'rougeLsum': 0.9129359515643884} |
138
+ | 0.1415 | 27.3687 | 7800 | 0.1607 | 0.1089 | 0.7971 | {'rouge1': 0.9137517387960785, 'rouge2': 0.8501495668327003, 'rougeL': 0.913338099417192, 'rougeLsum': 0.9136064287202272} |
139
+ | 0.1496 | 27.7199 | 7900 | 0.1597 | 0.1095 | 0.7966 | {'rouge1': 0.9130600978677069, 'rouge2': 0.8504810140358678, 'rougeL': 0.9127735778942226, 'rougeLsum': 0.9125405596281091} |
140
+ | 0.1492 | 28.0702 | 8000 | 0.1596 | 0.1103 | 0.7955 | {'rouge1': 0.9120508555985649, 'rouge2': 0.8487431673284105, 'rougeL': 0.9118153720293241, 'rougeLsum': 0.9116675598551898} |
141
+ | 0.1433 | 28.4214 | 8100 | 0.1600 | 0.1097 | 0.7967 | {'rouge1': 0.912271157742592, 'rouge2': 0.8491645313481614, 'rougeL': 0.9119383068792002, 'rougeLsum': 0.9120338521552568} |
142
+ | 0.1439 | 28.7726 | 8200 | 0.1591 | 0.1080 | 0.7995 | {'rouge1': 0.9144489684474263, 'rouge2': 0.8511979198969957, 'rougeL': 0.9137012695418792, 'rougeLsum': 0.9139716479376014} |
143
+ | 0.1325 | 29.1229 | 8300 | 0.1598 | 0.1085 | 0.7984 | {'rouge1': 0.9139270928380065, 'rouge2': 0.8507649128122179, 'rougeL': 0.9135264918778438, 'rougeLsum': 0.9135318897680185} |
144
+ | 0.1416 | 29.4741 | 8400 | 0.1595 | 0.1077 | 0.7997 | {'rouge1': 0.9146277490504104, 'rouge2': 0.8513032425483812, 'rougeL': 0.9140852902087768, 'rougeLsum': 0.9140094297264637} |
145
+ | 0.1451 | 29.8253 | 8500 | 0.1592 | 0.1081 | 0.7992 | {'rouge1': 0.9142461267493629, 'rouge2': 0.8512223456977356, 'rougeL': 0.9140461108455781, 'rougeLsum': 0.9139759112519872} |
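Because the `Rouge` column stores a Python dict literal in each cell, the table is awkward to compare by eye. As a rough sketch, the rows can be parsed with the standard library and ranked by any column. The three rows below are copied from the end of the table; the helper name `parse_row` and the output field names are illustrative assumptions.

```python
import ast

# Last three rows of the results table, copied verbatim (diff markers removed).
ROWS = """\
| 0.1325 | 29.1229 | 8300 | 0.1598 | 0.1085 | 0.7984 | {'rouge1': 0.9139270928380065, 'rouge2': 0.8507649128122179, 'rougeL': 0.9135264918778438, 'rougeLsum': 0.9135318897680185} |
| 0.1416 | 29.4741 | 8400 | 0.1595 | 0.1077 | 0.7997 | {'rouge1': 0.9146277490504104, 'rouge2': 0.8513032425483812, 'rougeL': 0.9140852902087768, 'rougeLsum': 0.9140094297264637} |
| 0.1451 | 29.8253 | 8500 | 0.1592 | 0.1081 | 0.7992 | {'rouge1': 0.9142461267493629, 'rouge2': 0.8512223456977356, 'rougeL': 0.9140461108455781, 'rougeLsum': 0.9139759112519872} |
"""


def parse_row(line: str) -> dict:
    """Turn one markdown table row into a dict of typed values."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    train_loss, epoch, step, val_loss, wer, bleu, rouge = cells
    return {
        "train_loss": float(train_loss),
        "epoch": float(epoch),
        "step": int(step),
        "val_loss": float(val_loss),
        "wer": float(wer),
        "bleu": float(bleu),
        # The Rouge cell is a dict literal, so literal_eval parses it safely.
        "rouge": ast.literal_eval(rouge),
    }


parsed = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
best_by_wer = min(parsed, key=lambda row: row["wer"])
```

Among these three rows the lowest WER (0.1077) is at step 8400, while the summary figures at the top of the card match the final logged row at step 8500.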
146
+
147
+
148
+ ### Framework versions
149
+
150
+ - Transformers 4.49.0
151
+ - Pytorch 2.6.0+cu124
152
+ - Datasets 3.2.0
153
+ - Tokenizers 0.21.0
adapter.fr.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7044beb0469c98c1c6dbe2203c63c7f8115e83d27594e9d66b14c96f710a9b63
3
+ size 8880524
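The three lines added for `adapter.fr.safetensors` are a Git LFS pointer, not the weights themselves: each line is a key, a space, and a value, with the actual file stored separately and identified by its SHA-256 digest. A minimal sketch of reading such a pointer follows; the function name `parse_lfs_pointer` and the returned field names are illustrative assumptions.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value

    # The oid value has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:7044beb0469c98c1c6dbe2203c63c7f8115e83d27594e9d66b14c96f710a9b63\n"
    "size 8880524\n"
)
```

Here `pointer["size_bytes"]` is 8880524, so the adapter file is about 8.9 MB, compared with 3858972908 bytes (about 3.9 GB) for `model.safetensors`.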
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:02b42d33c1379d71f8d785de84e9e59064a41d798e51b5cebde7ddbf2dd4444c
3
  size 3858972908
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:61556ab8b544ea1bfca5fafb50a4ab8ed1477665798920d251764e55bee4905d
3
  size 3858972908