End of training
- README.md +149 -195
- adapter.fr.safetensors +3 -0
- model.safetensors +1 -1
README.md
CHANGED
@@ -1,199 +1,153 @@
 ---
 library_name: transformers
-
 ---

-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-
-## Technical Specifications [optional]
-
-### Model Architecture and Objective
-
-[More Information Needed]
-
-### Compute Infrastructure
-
-[More Information Needed]
-
-#### Hardware
-
-[More Information Needed]
-
-#### Software
-
-[More Information Needed]
-
-## Citation [optional]
-
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
-**BibTeX:**
-
-[More Information Needed]
-
-**APA:**
-
-[More Information Needed]
-
-## Glossary [optional]
-
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
-[More Information Needed]
-
-## More Information [optional]
-
-[More Information Needed]
-
-## Model Card Authors [optional]
-
-[More Information Needed]
-
-## Model Card Contact
-
-[More Information Needed]
 ---
 library_name: transformers
+license: cc-by-nc-4.0
+base_model: facebook/mms-1b-all
+tags:
+- generated_from_trainer
+metrics:
+- wer
+- bleu
+- rouge
+model-index:
+- name: frdirect
+  results: []
 ---

+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+
+# frdirect
+
+This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.1592
+- Wer: 0.1081
+- Bleu: 0.7992
+- Rouge: {'rouge1': 0.9142461267493629, 'rouge2': 0.8512223456977356, 'rougeL': 0.9140461108455781, 'rougeLsum': 0.9139759112519872}
+
+## Model description
+
+More information needed
+
+## Intended uses & limitations
+
+More information needed
+
+## Training and evaluation data
+
+More information needed
+
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 0.001
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 32
+- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 100
+- num_epochs: 30
+- mixed_precision_training: Native AMP
+
+### Training results
+
+| Training Loss | Epoch   | Step | Validation Loss | Wer    | Bleu   | Rouge                                                                                                                       |
+|:-------------:|:-------:|:----:|:---------------:|:------:|:------:|:---------------------------------------------------------------------------------------------------------------------------:|
+| 6.0797        | 0.3512  | 100  | 0.3918          | 0.2286 | 0.6250 | {'rouge1': 0.7964874009051572, 'rouge2': 0.6786673339200013, 'rougeL': 0.795684539225159, 'rougeLsum': 0.7956009761818327}  |
+| 0.4361        | 0.7024  | 200  | 0.3395          | 0.2025 | 0.6572 | {'rouge1': 0.8286329956380034, 'rouge2': 0.7214521206108135, 'rougeL': 0.8280112126425977, 'rougeLsum': 0.8280818081170968} |
+| 0.4069        | 1.0527  | 300  | 0.2683          | 0.1944 | 0.6691 | {'rouge1': 0.8333662643750593, 'rouge2': 0.7283520260829215, 'rougeL': 0.8329211797586267, 'rougeLsum': 0.8326890662231179} |
+| 0.3738        | 1.4039  | 400  | 0.2500          | 0.1832 | 0.6885 | {'rouge1': 0.8475464903860954, 'rouge2': 0.7490364749206575, 'rougeL': 0.8470167743507384, 'rougeLsum': 0.8469648141276026} |
+| 0.3393        | 1.7550  | 500  | 0.2473          | 0.1806 | 0.6872 | {'rouge1': 0.8487702037730754, 'rouge2': 0.7515019894679165, 'rougeL': 0.8482651202728038, 'rougeLsum': 0.8479933284128696} |
+| 0.3337        | 2.1054  | 600  | 0.2341          | 0.1745 | 0.7015 | {'rouge1': 0.856524582926171, 'rouge2': 0.7615922551676065, 'rougeL': 0.8562090285236332, 'rougeLsum': 0.8560778549085171}  |
+| 0.3273        | 2.4565  | 700  | 0.2287          | 0.1776 | 0.6934 | {'rouge1': 0.8518960784546261, 'rouge2': 0.7571574767836958, 'rougeL': 0.8512601033375955, 'rougeLsum': 0.8510232422493931} |
+| 0.3212        | 2.8077  | 800  | 0.2195          | 0.1673 | 0.7067 | {'rouge1': 0.8597006641882449, 'rouge2': 0.7675023646180051, 'rougeL': 0.8592257714231155, 'rougeLsum': 0.8593257522509605} |
+| 0.2989        | 3.1580  | 900  | 0.2214          | 0.1633 | 0.7125 | {'rouge1': 0.8652652267566208, 'rouge2': 0.7751359730996655, 'rougeL': 0.8651950850137953, 'rougeLsum': 0.8648147312388779} |
+| 0.2849        | 3.5092  | 1000 | 0.2176          | 0.1610 | 0.7119 | {'rouge1': 0.8674026972599356, 'rouge2': 0.7767934576217497, 'rougeL': 0.8670611483514326, 'rougeLsum': 0.866923394119187}  |
+| 0.321         | 3.8604  | 1100 | 0.2140          | 0.1562 | 0.7203 | {'rouge1': 0.8687120531054047, 'rouge2': 0.7801284964910911, 'rougeL': 0.8685687320059696, 'rougeLsum': 0.8686957534611408} |
+| 0.2901        | 4.2107  | 1200 | 0.2092          | 0.1570 | 0.7233 | {'rouge1': 0.870783939334159, 'rouge2': 0.7846870067295553, 'rougeL': 0.8700614974709336, 'rougeLsum': 0.8703440992984535}  |
+| 0.2758        | 4.5619  | 1300 | 0.2208          | 0.1678 | 0.7044 | {'rouge1': 0.8627882593824365, 'rouge2': 0.7732336382659819, 'rougeL': 0.8623362277788245, 'rougeLsum': 0.862456577073438}  |
+| 0.2802        | 4.9131  | 1400 | 0.2039          | 0.1547 | 0.7258 | {'rouge1': 0.8731727674477189, 'rouge2': 0.7886178130374446, 'rougeL': 0.872672601389526, 'rougeLsum': 0.8726659641042169}  |
+| 0.2638        | 5.2634  | 1500 | 0.2043          | 0.1510 | 0.7335 | {'rouge1': 0.8755955637027819, 'rouge2': 0.7930493884188712, 'rougeL': 0.8751631654870777, 'rougeLsum': 0.8751907751029582} |
+| 0.2752        | 5.6146  | 1600 | 0.2055          | 0.1551 | 0.7270 | {'rouge1': 0.872388381525685, 'rouge2': 0.7875275384987104, 'rougeL': 0.8719998038854011, 'rougeLsum': 0.8716380106368946}  |
+| 0.2611        | 5.9658  | 1700 | 0.2000          | 0.1470 | 0.7371 | {'rouge1': 0.8788848516546419, 'rouge2': 0.7961419908259184, 'rougeL': 0.8787077158049774, 'rougeLsum': 0.8785400491349351} |
+| 0.2473        | 6.3161  | 1800 | 0.1964          | 0.1480 | 0.7367 | {'rouge1': 0.8780453998988988, 'rouge2': 0.7968768691849546, 'rougeL': 0.877539022180082, 'rougeLsum': 0.8772607424486614}  |
+| 0.2595        | 6.6673  | 1900 | 0.2025          | 0.1480 | 0.7381 | {'rouge1': 0.879639846099505, 'rouge2': 0.797600429803611, 'rougeL': 0.8793686789606971, 'rougeLsum': 0.8790549352654082}   |
+| 0.2689        | 7.0176  | 2000 | 0.1969          | 0.1432 | 0.7430 | {'rouge1': 0.881797326390697, 'rouge2': 0.8004647695765528, 'rougeL': 0.8813203554835087, 'rougeLsum': 0.8811828285656307}  |
+| 0.246         | 7.3687  | 2100 | 0.1963          | 0.1449 | 0.7398 | {'rouge1': 0.8817110807418125, 'rouge2': 0.8017781199834159, 'rougeL': 0.8815737656302565, 'rougeLsum': 0.8813654064210932} |
+| 0.2502        | 7.7199  | 2200 | 0.1925          | 0.1492 | 0.7347 | {'rouge1': 0.8793953293462229, 'rouge2': 0.7995783364307951, 'rougeL': 0.8789946756811035, 'rougeLsum': 0.8792323972067102} |
+| 0.2355        | 8.0702  | 2300 | 0.1912          | 0.1402 | 0.7460 | {'rouge1': 0.8848122766361803, 'rouge2': 0.805493594353921, 'rougeL': 0.8843513031942714, 'rougeLsum': 0.8847078352624693}  |
+| 0.2366        | 8.4214  | 2400 | 0.1885          | 0.1412 | 0.7426 | {'rouge1': 0.8840181215146308, 'rouge2': 0.8027935769840546, 'rougeL': 0.8836147367072817, 'rougeLsum': 0.8836426860532026} |
+| 0.2407        | 8.7726  | 2500 | 0.1918          | 0.1447 | 0.7397 | {'rouge1': 0.8821736207296971, 'rouge2': 0.801567427041519, 'rougeL': 0.881521485738069, 'rougeLsum': 0.88187487496196}     |
+| 0.2387        | 9.1229  | 2600 | 0.1903          | 0.1334 | 0.7605 | {'rouge1': 0.8897429883784935, 'rouge2': 0.8137161586716353, 'rougeL': 0.8892440561394621, 'rougeLsum': 0.8891549602299966} |
+| 0.2347        | 9.4741  | 2700 | 0.1834          | 0.1345 | 0.7572 | {'rouge1': 0.8881942208247308, 'rouge2': 0.8119581058011768, 'rougeL': 0.8879981514910154, 'rougeLsum': 0.887901612349636}  |
+| 0.2271        | 9.8253  | 2800 | 0.1858          | 0.1335 | 0.7590 | {'rouge1': 0.8911506791113586, 'rouge2': 0.8148539957097256, 'rougeL': 0.8905751461168572, 'rougeLsum': 0.890902060551137}  |
+| 0.2303        | 10.1756 | 2900 | 0.1858          | 0.1365 | 0.7501 | {'rouge1': 0.887481336474667, 'rouge2': 0.8085757416110948, 'rougeL': 0.887243114877972, 'rougeLsum': 0.8871195261439274}   |
+| 0.2284        | 10.5268 | 3000 | 0.1873          | 0.1348 | 0.7543 | {'rouge1': 0.8896704902968307, 'rouge2': 0.8119666309133653, 'rougeL': 0.8895050087347427, 'rougeLsum': 0.8893208588598065} |
+| 0.2177        | 10.8780 | 3100 | 0.1899          | 0.1412 | 0.7429 | {'rouge1': 0.883841093950449, 'rouge2': 0.8027541651011862, 'rougeL': 0.8832016518265267, 'rougeLsum': 0.8831940672343328}  |
+| 0.2259        | 11.2283 | 3200 | 0.1841          | 0.1382 | 0.7505 | {'rouge1': 0.8861091671648993, 'rouge2': 0.8076445498910136, 'rougeL': 0.8860146387775966, 'rougeLsum': 0.8856781536323487} |
+| 0.2183        | 11.5795 | 3300 | 0.1803          | 0.1333 | 0.7589 | {'rouge1': 0.8917208112217359, 'rouge2': 0.8180400270633807, 'rougeL': 0.8911859595950976, 'rougeLsum': 0.8913321406932759} |
+| 0.2124        | 11.9306 | 3400 | 0.1826          | 0.1309 | 0.7626 | {'rouge1': 0.8924689635777846, 'rouge2': 0.8186471633950445, 'rougeL': 0.891927412819415, 'rougeLsum': 0.8920422701086449}  |
+| 0.1961        | 12.2809 | 3500 | 0.1824          | 0.1300 | 0.7648 | {'rouge1': 0.8947275911029863, 'rouge2': 0.8218776029324886, 'rougeL': 0.8942104883105186, 'rougeLsum': 0.894350829557474}  |
+| 0.2121        | 12.6321 | 3600 | 0.1792          | 0.1278 | 0.7649 | {'rouge1': 0.8965227623557459, 'rouge2': 0.8223749938722336, 'rougeL': 0.8961352103229281, 'rougeLsum': 0.8959098417776623} |
+| 0.2087        | 12.9833 | 3700 | 0.1767          | 0.1294 | 0.7648 | {'rouge1': 0.8960785872160766, 'rouge2': 0.8226243103661531, 'rougeL': 0.8959515880736657, 'rougeLsum': 0.8959147567781259} |
+| 0.1943        | 13.3336 | 3800 | 0.1801          | 0.1288 | 0.7644 | {'rouge1': 0.8947704603138941, 'rouge2': 0.8196022891096679, 'rougeL': 0.8945419678439788, 'rougeLsum': 0.8941955883885566} |
+| 0.2053        | 13.6848 | 3900 | 0.1732          | 0.1269 | 0.7682 | {'rouge1': 0.8953389907409508, 'rouge2': 0.8204708771662934, 'rougeL': 0.8942418907803242, 'rougeLsum': 0.8943319307650137} |
+| 0.2196        | 14.0351 | 4000 | 0.1722          | 0.1258 | 0.7704 | {'rouge1': 0.8971096393899529, 'rouge2': 0.8245173641233066, 'rougeL': 0.8968754492659268, 'rougeLsum': 0.8967885844102701} |
+| 0.1996        | 14.3863 | 4100 | 0.1746          | 0.1283 | 0.7663 | {'rouge1': 0.8992149614052899, 'rouge2': 0.8283870278179525, 'rougeL': 0.8990034855026199, 'rougeLsum': 0.899113705236827}  |
+| 0.2028        | 14.7375 | 4200 | 0.1723          | 0.1258 | 0.7688 | {'rouge1': 0.8981221304907357, 'rouge2': 0.8268085111954614, 'rougeL': 0.8978730997154603, 'rougeLsum': 0.8978255561942574} |
+| 0.1784        | 15.0878 | 4300 | 0.1741          | 0.1210 | 0.7777 | {'rouge1': 0.9022989643269592, 'rouge2': 0.8321097618769113, 'rougeL': 0.9019679668540621, 'rougeLsum': 0.9020840316410879} |
+| 0.1954        | 15.4390 | 4400 | 0.1746          | 0.1234 | 0.7748 | {'rouge1': 0.8990459493090655, 'rouge2': 0.8287806337458845, 'rougeL': 0.8990074200510402, 'rougeLsum': 0.8988288738757491} |
+| 0.1916        | 15.7902 | 4500 | 0.1719          | 0.1230 | 0.7761 | {'rouge1': 0.900488872562492, 'rouge2': 0.8307065830708865, 'rougeL': 0.9000372599293843, 'rougeLsum': 0.9002263530831652}  |
+| 0.1883        | 16.1405 | 4600 | 0.1712          | 0.1226 | 0.7757 | {'rouge1': 0.9019026814628661, 'rouge2': 0.8329976152495208, 'rougeL': 0.9016592566182122, 'rougeLsum': 0.9016282070055117} |
+| 0.1832        | 16.4917 | 4700 | 0.1713          | 0.1248 | 0.7733 | {'rouge1': 0.8995223210908226, 'rouge2': 0.8290222714943427, 'rougeL': 0.8994032458040973, 'rougeLsum': 0.899366836054343}  |
+| 0.1888        | 16.8428 | 4800 | 0.1698          | 0.1264 | 0.7721 | {'rouge1': 0.8982535964067325, 'rouge2': 0.8288396477969829, 'rougeL': 0.8977273104751539, 'rougeLsum': 0.8976965343038692} |
+| 0.1857        | 17.1932 | 4900 | 0.1718          | 0.1230 | 0.7757 | {'rouge1': 0.9026836932266615, 'rouge2': 0.833487216136216, 'rougeL': 0.9022265966641445, 'rougeLsum': 0.9024265508303717}  |
+| 0.1858        | 17.5443 | 5000 | 0.1705          | 0.1204 | 0.7792 | {'rouge1': 0.9046938392928605, 'rouge2': 0.8378406365404705, 'rougeL': 0.904281513678646, 'rougeLsum': 0.9040395290033556}  |
+| 0.1838        | 17.8955 | 5100 | 0.1713          | 0.1222 | 0.7773 | {'rouge1': 0.9025388171823945, 'rouge2': 0.8339417592886358, 'rougeL': 0.901982659834949, 'rougeLsum': 0.902191073170105}   |
+| 0.1784        | 18.2458 | 5200 | 0.1710          | 0.1228 | 0.7741 | {'rouge1': 0.9027135961066619, 'rouge2': 0.8338803735375095, 'rougeL': 0.9021313465595333, 'rougeLsum': 0.9022241778623963} |
+| 0.1748        | 18.5970 | 5300 | 0.1700          | 0.1205 | 0.7803 | {'rouge1': 0.9028943462107296, 'rouge2': 0.8360887354751103, 'rougeL': 0.9028049282097476, 'rougeLsum': 0.9026754818313738} |
+| 0.1785        | 18.9482 | 5400 | 0.1683          | 0.1191 | 0.7827 | {'rouge1': 0.9058615541527929, 'rouge2': 0.8397686128782502, 'rougeL': 0.905380338988193, 'rougeLsum': 0.9053818296865667}  |
+| 0.1715        | 19.2985 | 5500 | 0.1693          | 0.1197 | 0.7813 | {'rouge1': 0.9042951746231659, 'rouge2': 0.836346216169132, 'rougeL': 0.9038200107993153, 'rougeLsum': 0.9038172163450839}  |
+| 0.1743        | 19.6497 | 5600 | 0.1656          | 0.1198 | 0.7820 | {'rouge1': 0.9056265375031829, 'rouge2': 0.8389744056310899, 'rougeL': 0.9052424864181929, 'rougeLsum': 0.9052107898473516} |
+| 0.179         | 20.0    | 5700 | 0.1662          | 0.1200 | 0.7813 | {'rouge1': 0.9049957724375504, 'rouge2': 0.838664745594604, 'rougeL': 0.9047378933290081, 'rougeLsum': 0.9047796552290915}  |
+| 0.1705        | 20.3512 | 5800 | 0.1671          | 0.1158 | 0.7875 | {'rouge1': 0.9066988025993499, 'rouge2': 0.84144312190399, 'rougeL': 0.9066516406825784, 'rougeLsum': 0.9064192832523982}   |
+| 0.1737        | 20.7024 | 5900 | 0.1668          | 0.1191 | 0.7809 | {'rouge1': 0.9044882408830994, 'rouge2': 0.8379831752266652, 'rougeL': 0.9044191300138034, 'rougeLsum': 0.9041602420355832} |
+| 0.161         | 21.0527 | 6000 | 0.1675          | 0.1176 | 0.7855 | {'rouge1': 0.9057981628991186, 'rouge2': 0.8402890351927816, 'rougeL': 0.9055204210299842, 'rougeLsum': 0.9055949937515819} |
+| 0.1634        | 21.4039 | 6100 | 0.1656          | 0.1172 | 0.7849 | {'rouge1': 0.9052591361441602, 'rouge2': 0.8390947395192179, 'rougeL': 0.904901443457665, 'rougeLsum': 0.9049274938729717}  |
+| 0.1717        | 21.7550 | 6200 | 0.1655          | 0.1184 | 0.7850 | {'rouge1': 0.9056855411229652, 'rouge2': 0.8391993079429194, 'rougeL': 0.9051256484322957, 'rougeLsum': 0.9054924942388913} |
+| 0.1532        | 22.1054 | 6300 | 0.1640          | 0.1138 | 0.7895 | {'rouge1': 0.908514689307546, 'rouge2': 0.8432231540949471, 'rougeL': 0.9081112639646672, 'rougeLsum': 0.908272589564951}   |
+| 0.167         | 22.4565 | 6400 | 0.1626          | 0.1140 | 0.7921 | {'rouge1': 0.9103164260947302, 'rouge2': 0.847605808196412, 'rougeL': 0.9098933635969182, 'rougeLsum': 0.9098438675661122}  |
+| 0.1606        | 22.8077 | 6500 | 0.1632          | 0.1115 | 0.7956 | {'rouge1': 0.9112987341936316, 'rouge2': 0.8492302659397843, 'rougeL': 0.9111470792331513, 'rougeLsum': 0.9109445157457727} |
+| 0.1599        | 23.1580 | 6600 | 0.1642          | 0.1108 | 0.7960 | {'rouge1': 0.9118713013856634, 'rouge2': 0.848580538085876, 'rougeL': 0.9114618012803377, 'rougeLsum': 0.9116046463908325}  |
+| 0.156         | 23.5092 | 6700 | 0.1634          | 0.1122 | 0.7950 | {'rouge1': 0.9106394674615637, 'rouge2': 0.8474609350248357, 'rougeL': 0.910065118851396, 'rougeLsum': 0.9100662823980902}  |
+| 0.1589        | 23.8604 | 6800 | 0.1625          | 0.1130 | 0.7916 | {'rouge1': 0.9103350470563911, 'rouge2': 0.8474738385027315, 'rougeL': 0.9099995608863516, 'rougeLsum': 0.9100397914660747} |
+| 0.1622        | 24.2107 | 6900 | 0.1626          | 0.1117 | 0.7943 | {'rouge1': 0.9133839350790938, 'rouge2': 0.8511081438025545, 'rougeL': 0.9126575984413424, 'rougeLsum': 0.9126505650621592} |
+| 0.1521        | 24.5619 | 7000 | 0.1618          | 0.1109 | 0.7963 | {'rouge1': 0.912211680613469, 'rouge2': 0.8496181639239891, 'rougeL': 0.9117487343663472, 'rougeLsum': 0.9117499155750092}  |
+| 0.1503        | 24.9131 | 7100 | 0.1612          | 0.1115 | 0.7945 | {'rouge1': 0.9119319650927245, 'rouge2': 0.8489304675858942, 'rougeL': 0.9115897952726388, 'rougeLsum': 0.9116405544813904} |
+| 0.1504        | 25.2634 | 7200 | 0.1621          | 0.1103 | 0.7957 | {'rouge1': 0.9121880227389143, 'rouge2': 0.8492463401850738, 'rougeL': 0.9118447435959087, 'rougeLsum': 0.9118033384400608} |
+| 0.1519        | 25.6146 | 7300 | 0.1615          | 0.1118 | 0.7931 | {'rouge1': 0.9112498683244998, 'rouge2': 0.8480832804563686, 'rougeL': 0.9107105272229861, 'rougeLsum': 0.9107937803611704} |
+| 0.1479        | 25.9658 | 7400 | 0.1611          | 0.1098 | 0.7974 | {'rouge1': 0.9136242054251107, 'rouge2': 0.8509392563862166, 'rougeL': 0.9130826085424906, 'rougeLsum': 0.9132200366521013} |
+| 0.1437        | 26.3161 | 7500 | 0.1609          | 0.1099 | 0.7958 | {'rouge1': 0.9130313007924953, 'rouge2': 0.8502201629741937, 'rougeL': 0.9126113638303235, 'rougeLsum': 0.9126553187114919} |
+| 0.148         | 26.6673 | 7600 | 0.1609          | 0.1095 | 0.7968 | {'rouge1': 0.9137297115874194, 'rouge2': 0.8504308144873894, 'rougeL': 0.9133787545300738, 'rougeLsum': 0.9133679379590172} |
+| 0.1568        | 27.0176 | 7700 | 0.1607          | 0.1107 | 0.7945 | {'rouge1': 0.9131462089688931, 'rouge2': 0.8499313234434469, 'rougeL': 0.912728356326638, 'rougeLsum': 0.9129359515643884}  |
+| 0.1415        | 27.3687 | 7800 | 0.1607          | 0.1089 | 0.7971 | {'rouge1': 0.9137517387960785, 'rouge2': 0.8501495668327003, 'rougeL': 0.913338099417192, 'rougeLsum': 0.9136064287202272}  |
+| 0.1496        | 27.7199 | 7900 | 0.1597          | 0.1095 | 0.7966 | {'rouge1': 0.9130600978677069, 'rouge2': 0.8504810140358678, 'rougeL': 0.9127735778942226, 'rougeLsum': 0.9125405596281091} |
+| 0.1492        | 28.0702 | 8000 | 0.1596          | 0.1103 | 0.7955 | {'rouge1': 0.9120508555985649, 'rouge2': 0.8487431673284105, 'rougeL': 0.9118153720293241, 'rougeLsum': 0.9116675598551898} |
+| 0.1433        | 28.4214 | 8100 | 0.1600          | 0.1097 | 0.7967 | {'rouge1': 0.912271157742592, 'rouge2': 0.8491645313481614, 'rougeL': 0.9119383068792002, 'rougeLsum': 0.9120338521552568}  |
+| 0.1439        | 28.7726 | 8200 | 0.1591          | 0.1080 | 0.7995 | {'rouge1': 0.9144489684474263, 'rouge2': 0.8511979198969957, 'rougeL': 0.9137012695418792, 'rougeLsum': 0.9139716479376014} |
+| 0.1325        | 29.1229 | 8300 | 0.1598          | 0.1085 | 0.7984 | {'rouge1': 0.9139270928380065, 'rouge2': 0.8507649128122179, 'rougeL': 0.9135264918778438, 'rougeLsum': 0.9135318897680185} |
+| 0.1416        | 29.4741 | 8400 | 0.1595          | 0.1077 | 0.7997 | {'rouge1': 0.9146277490504104, 'rouge2': 0.8513032425483812, 'rougeL': 0.9140852902087768, 'rougeLsum': 0.9140094297264637} |
+| 0.1451        | 29.8253 | 8500 | 0.1592          | 0.1081 | 0.7992 | {'rouge1': 0.9142461267493629, 'rouge2': 0.8512223456977356, 'rougeL': 0.9140461108455781, 'rougeLsum': 0.9139759112519872} |
+
+
+### Framework versions
+
+- Transformers 4.49.0
+- Pytorch 2.6.0+cu124
+- Datasets 3.2.0
+- Tokenizers 0.21.0
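
The hyperparameters and results table in the new README are enough to reconstruct the approximate shape of the learning-rate schedule. The sketch below is an assumption-laden reading, not the author's training code: it infers 285 optimizer steps per epoch from the table row that logs step 5700 at epoch 20.0, assumes a single device (since 8 × 4 already equals the reported total batch size of 32), and uses the usual linear warmup then linear decay formula for a `linear` scheduler. All constant and function names are chosen for this note.

```python
LEARNING_RATE = 1e-3   # learning_rate
PER_DEVICE_BATCH = 8   # train_batch_size
GRAD_ACCUM = 4         # gradient_accumulation_steps
WARMUP_STEPS = 100     # lr_scheduler_warmup_steps
NUM_EPOCHS = 30        # num_epochs

# Inferred from the results table: step 5700 is logged at epoch 20.0.
STEPS_PER_EPOCH = 5700 // 20
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def effective_batch_size(per_device: int = PER_DEVICE_BATCH,
                         grad_accum: int = GRAD_ACCUM,
                         num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * grad_accum * num_devices


def linear_lr(step: int,
              peak: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS,
              total: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to peak, then linear decay to 0 at the final step."""
    if step < warmup:
        return peak * step / max(1, warmup)
    return peak * max(0.0, (total - step) / max(1, total - warmup))


print("effective batch size:", effective_batch_size())
print("steps per epoch:", STEPS_PER_EPOCH, "total steps:", TOTAL_STEPS)
for step in (0, 50, 100, 4325, 8500, TOTAL_STEPS):
    print(step, linear_lr(step))
```

If these assumptions hold, the learning rate peaks at step 100, which is also the first evaluation row in the table, and has decayed almost to zero by the last logged step (8500 of an inferred 8550).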
adapter.fr.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7044beb0469c98c1c6dbe2203c63c7f8115e83d27594e9d66b14c96f710a9b63
+size 8880524
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:61556ab8b544ea1bfca5fafb50a4ab8ed1477665798920d251764e55bee4905d
 size 3858972908
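
Both `.safetensors` entries in this commit are Git LFS pointer files (a `version` line, an `oid` line, and a `size` line) rather than the weights themselves. The sketch below shows one way to parse such a pointer and check a downloaded file against it. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of Git LFS or any Hugging Face library; the pointer text and the model size are copied from the diffs above.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and digest the pointer records."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so multi-gigabyte weights are never held in memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


ADAPTER_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:7044beb0469c98c1c6dbe2203c63c7f8115e83d27594e9d66b14c96f710a9b63
size 8880524
"""
MODEL_SIZE_BYTES = 3858972908  # size line of model.safetensors in this commit

pointer = parse_lfs_pointer(ADAPTER_POINTER)
adapter_fraction = pointer["size"] / MODEL_SIZE_BYTES
print(f"adapter is {adapter_fraction:.2%} of the full checkpoint size")
```

After downloading the real file, `matches_pointer("adapter.fr.safetensors", pointer)` would confirm that the local copy is the one this commit added. The size comparison shows that the adapter is roughly 0.23% of the size of the full checkpoint.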