End of training

Files changed (3)

README.md +149 -195
adapter.fr.safetensors +3 -0
model.safetensors +1 -1

README.md CHANGED

@@ -1,199 +1,153 @@
 ---
 library_name: transformers
-tags: []
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
 library_name: transformers
+license: cc-by-nc-4.0
+base_model: facebook/mms-1b-all
+tags:
+- generated_from_trainer
+metrics:
+- wer
+- bleu
+- rouge
+model-index:
+- name: frdirect
+  results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# frdirect
+This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.1592
+- Wer: 0.1081
+- Bleu: 0.7992
+- Rouge: {'rouge1': 0.9142461267493629, 'rouge2': 0.8512223456977356, 'rougeL': 0.9140461108455781, 'rougeLsum': 0.9139759112519872}
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.001
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 32
+- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 100
+- num_epochs: 30
+- mixed_precision_training: Native AMP
+### Training results
+| Training Loss | Epoch   | Step | Validation Loss | Wer    | Bleu   | Rouge                                                                                                                       |
+|:-------------:|:-------:|:----:|:---------------:|:------:|:------:|:---------------------------------------------------------------------------------------------------------------------------:|
+| 6.0797        | 0.3512  | 100  | 0.3918          | 0.2286 | 0.6250 | {'rouge1': 0.7964874009051572, 'rouge2': 0.6786673339200013, 'rougeL': 0.795684539225159, 'rougeLsum': 0.7956009761818327}  |
+| 0.4361        | 0.7024  | 200  | 0.3395          | 0.2025 | 0.6572 | {'rouge1': 0.8286329956380034, 'rouge2': 0.7214521206108135, 'rougeL': 0.8280112126425977, 'rougeLsum': 0.8280818081170968} |
+| 0.4069        | 1.0527  | 300  | 0.2683          | 0.1944 | 0.6691 | {'rouge1': 0.8333662643750593, 'rouge2': 0.7283520260829215, 'rougeL': 0.8329211797586267, 'rougeLsum': 0.8326890662231179} |
+| 0.3738        | 1.4039  | 400  | 0.2500          | 0.1832 | 0.6885 | {'rouge1': 0.8475464903860954, 'rouge2': 0.7490364749206575, 'rougeL': 0.8470167743507384, 'rougeLsum': 0.8469648141276026} |
+| 0.3393        | 1.7550  | 500  | 0.2473          | 0.1806 | 0.6872 | {'rouge1': 0.8487702037730754, 'rouge2': 0.7515019894679165, 'rougeL': 0.8482651202728038, 'rougeLsum': 0.8479933284128696} |
+| 0.3337        | 2.1054  | 600  | 0.2341          | 0.1745 | 0.7015 | {'rouge1': 0.856524582926171, 'rouge2': 0.7615922551676065, 'rougeL': 0.8562090285236332, 'rougeLsum': 0.8560778549085171}  |
+| 0.3273        | 2.4565  | 700  | 0.2287          | 0.1776 | 0.6934 | {'rouge1': 0.8518960784546261, 'rouge2': 0.7571574767836958, 'rougeL': 0.8512601033375955, 'rougeLsum': 0.8510232422493931} |
+| 0.3212        | 2.8077  | 800  | 0.2195          | 0.1673 | 0.7067 | {'rouge1': 0.8597006641882449, 'rouge2': 0.7675023646180051, 'rougeL': 0.8592257714231155, 'rougeLsum': 0.8593257522509605} |
+| 0.2989        | 3.1580  | 900  | 0.2214          | 0.1633 | 0.7125 | {'rouge1': 0.8652652267566208, 'rouge2': 0.7751359730996655, 'rougeL': 0.8651950850137953, 'rougeLsum': 0.8648147312388779} |
+| 0.2849        | 3.5092  | 1000 | 0.2176          | 0.1610 | 0.7119 | {'rouge1': 0.8674026972599356, 'rouge2': 0.7767934576217497, 'rougeL': 0.8670611483514326, 'rougeLsum': 0.866923394119187}  |
+| 0.321         | 3.8604  | 1100 | 0.2140          | 0.1562 | 0.7203 | {'rouge1': 0.8687120531054047, 'rouge2': 0.7801284964910911, 'rougeL': 0.8685687320059696, 'rougeLsum': 0.8686957534611408} |
+| 0.2901        | 4.2107  | 1200 | 0.2092          | 0.1570 | 0.7233 | {'rouge1': 0.870783939334159, 'rouge2': 0.7846870067295553, 'rougeL': 0.8700614974709336, 'rougeLsum': 0.8703440992984535}  |
+| 0.2758        | 4.5619  | 1300 | 0.2208          | 0.1678 | 0.7044 | {'rouge1': 0.8627882593824365, 'rouge2': 0.7732336382659819, 'rougeL': 0.8623362277788245, 'rougeLsum': 0.862456577073438}  |
+| 0.2802        | 4.9131  | 1400 | 0.2039          | 0.1547 | 0.7258 | {'rouge1': 0.8731727674477189, 'rouge2': 0.7886178130374446, 'rougeL': 0.872672601389526, 'rougeLsum': 0.8726659641042169}  |
+| 0.2638        | 5.2634  | 1500 | 0.2043          | 0.1510 | 0.7335 | {'rouge1': 0.8755955637027819, 'rouge2': 0.7930493884188712, 'rougeL': 0.8751631654870777, 'rougeLsum': 0.8751907751029582} |
+| 0.2752        | 5.6146  | 1600 | 0.2055          | 0.1551 | 0.7270 | {'rouge1': 0.872388381525685, 'rouge2': 0.7875275384987104, 'rougeL': 0.8719998038854011, 'rougeLsum': 0.8716380106368946}  |
+| 0.2611        | 5.9658  | 1700 | 0.2000          | 0.1470 | 0.7371 | {'rouge1': 0.8788848516546419, 'rouge2': 0.7961419908259184, 'rougeL': 0.8787077158049774, 'rougeLsum': 0.8785400491349351} |
+| 0.2473        | 6.3161  | 1800 | 0.1964          | 0.1480 | 0.7367 | {'rouge1': 0.8780453998988988, 'rouge2': 0.7968768691849546, 'rougeL': 0.877539022180082, 'rougeLsum': 0.8772607424486614}  |
+| 0.2595        | 6.6673  | 1900 | 0.2025          | 0.1480 | 0.7381 | {'rouge1': 0.879639846099505, 'rouge2': 0.797600429803611, 'rougeL': 0.8793686789606971, 'rougeLsum': 0.8790549352654082}   |
+| 0.2689        | 7.0176  | 2000 | 0.1969          | 0.1432 | 0.7430 | {'rouge1': 0.881797326390697, 'rouge2': 0.8004647695765528, 'rougeL': 0.8813203554835087, 'rougeLsum': 0.8811828285656307}  |
+| 0.246         | 7.3687  | 2100 | 0.1963          | 0.1449 | 0.7398 | {'rouge1': 0.8817110807418125, 'rouge2': 0.8017781199834159, 'rougeL': 0.8815737656302565, 'rougeLsum': 0.8813654064210932} |
+| 0.2502        | 7.7199  | 2200 | 0.1925          | 0.1492 | 0.7347 | {'rouge1': 0.8793953293462229, 'rouge2': 0.7995783364307951, 'rougeL': 0.8789946756811035, 'rougeLsum': 0.8792323972067102} |
+| 0.2355        | 8.0702  | 2300 | 0.1912          | 0.1402 | 0.7460 | {'rouge1': 0.8848122766361803, 'rouge2': 0.805493594353921, 'rougeL': 0.8843513031942714, 'rougeLsum': 0.8847078352624693}  |
+| 0.2366        | 8.4214  | 2400 | 0.1885          | 0.1412 | 0.7426 | {'rouge1': 0.8840181215146308, 'rouge2': 0.8027935769840546, 'rougeL': 0.8836147367072817, 'rougeLsum': 0.8836426860532026} |
+| 0.2407        | 8.7726  | 2500 | 0.1918          | 0.1447 | 0.7397 | {'rouge1': 0.8821736207296971, 'rouge2': 0.801567427041519, 'rougeL': 0.881521485738069, 'rougeLsum': 0.88187487496196}     |
+| 0.2387        | 9.1229  | 2600 | 0.1903          | 0.1334 | 0.7605 | {'rouge1': 0.8897429883784935, 'rouge2': 0.8137161586716353, 'rougeL': 0.8892440561394621, 'rougeLsum': 0.8891549602299966} |
+| 0.2347        | 9.4741  | 2700 | 0.1834          | 0.1345 | 0.7572 | {'rouge1': 0.8881942208247308, 'rouge2': 0.8119581058011768, 'rougeL': 0.8879981514910154, 'rougeLsum': 0.887901612349636}  |
+| 0.2271        | 9.8253  | 2800 | 0.1858          | 0.1335 | 0.7590 | {'rouge1': 0.8911506791113586, 'rouge2': 0.8148539957097256, 'rougeL': 0.8905751461168572, 'rougeLsum': 0.890902060551137}  |
+| 0.2303        | 10.1756 | 2900 | 0.1858          | 0.1365 | 0.7501 | {'rouge1': 0.887481336474667, 'rouge2': 0.8085757416110948, 'rougeL': 0.887243114877972, 'rougeLsum': 0.8871195261439274}   |
+| 0.2284        | 10.5268 | 3000 | 0.1873          | 0.1348 | 0.7543 | {'rouge1': 0.8896704902968307, 'rouge2': 0.8119666309133653, 'rougeL': 0.8895050087347427, 'rougeLsum': 0.8893208588598065} |
+| 0.2177        | 10.8780 | 3100 | 0.1899          | 0.1412 | 0.7429 | {'rouge1': 0.883841093950449, 'rouge2': 0.8027541651011862, 'rougeL': 0.8832016518265267, 'rougeLsum': 0.8831940672343328}  |
+| 0.2259        | 11.2283 | 3200 | 0.1841          | 0.1382 | 0.7505 | {'rouge1': 0.8861091671648993, 'rouge2': 0.8076445498910136, 'rougeL': 0.8860146387775966, 'rougeLsum': 0.8856781536323487} |
+| 0.2183        | 11.5795 | 3300 | 0.1803          | 0.1333 | 0.7589 | {'rouge1': 0.8917208112217359, 'rouge2': 0.8180400270633807, 'rougeL': 0.8911859595950976, 'rougeLsum': 0.8913321406932759} |
+| 0.2124        | 11.9306 | 3400 | 0.1826          | 0.1309 | 0.7626 | {'rouge1': 0.8924689635777846, 'rouge2': 0.8186471633950445, 'rougeL': 0.891927412819415, 'rougeLsum': 0.8920422701086449}  |
+| 0.1961        | 12.2809 | 3500 | 0.1824          | 0.1300 | 0.7648 | {'rouge1': 0.8947275911029863, 'rouge2': 0.8218776029324886, 'rougeL': 0.8942104883105186, 'rougeLsum': 0.894350829557474}  |
+| 0.2121        | 12.6321 | 3600 | 0.1792          | 0.1278 | 0.7649 | {'rouge1': 0.8965227623557459, 'rouge2': 0.8223749938722336, 'rougeL': 0.8961352103229281, 'rougeLsum': 0.8959098417776623} |
+| 0.2087        | 12.9833 | 3700 | 0.1767          | 0.1294 | 0.7648 | {'rouge1': 0.8960785872160766, 'rouge2': 0.8226243103661531, 'rougeL': 0.8959515880736657, 'rougeLsum': 0.8959147567781259} |
+| 0.1943        | 13.3336 | 3800 | 0.1801          | 0.1288 | 0.7644 | {'rouge1': 0.8947704603138941, 'rouge2': 0.8196022891096679, 'rougeL': 0.8945419678439788, 'rougeLsum': 0.8941955883885566} |
+| 0.2053        | 13.6848 | 3900 | 0.1732          | 0.1269 | 0.7682 | {'rouge1': 0.8953389907409508, 'rouge2': 0.8204708771662934, 'rougeL': 0.8942418907803242, 'rougeLsum': 0.8943319307650137} |
+| 0.2196        | 14.0351 | 4000 | 0.1722          | 0.1258 | 0.7704 | {'rouge1': 0.8971096393899529, 'rouge2': 0.8245173641233066, 'rougeL': 0.8968754492659268, 'rougeLsum': 0.8967885844102701} |
+| 0.1996        | 14.3863 | 4100 | 0.1746          | 0.1283 | 0.7663 | {'rouge1': 0.8992149614052899, 'rouge2': 0.8283870278179525, 'rougeL': 0.8990034855026199, 'rougeLsum': 0.899113705236827}  |
+| 0.2028        | 14.7375 | 4200 | 0.1723          | 0.1258 | 0.7688 | {'rouge1': 0.8981221304907357, 'rouge2': 0.8268085111954614, 'rougeL': 0.8978730997154603, 'rougeLsum': 0.8978255561942574} |
+| 0.1784        | 15.0878 | 4300 | 0.1741          | 0.1210 | 0.7777 | {'rouge1': 0.9022989643269592, 'rouge2': 0.8321097618769113, 'rougeL': 0.9019679668540621, 'rougeLsum': 0.9020840316410879} |
+| 0.1954        | 15.4390 | 4400 | 0.1746          | 0.1234 | 0.7748 | {'rouge1': 0.8990459493090655, 'rouge2': 0.8287806337458845, 'rougeL': 0.8990074200510402, 'rougeLsum': 0.8988288738757491} |
+| 0.1916        | 15.7902 | 4500 | 0.1719          | 0.1230 | 0.7761 | {'rouge1': 0.900488872562492, 'rouge2': 0.8307065830708865, 'rougeL': 0.9000372599293843, 'rougeLsum': 0.9002263530831652}  |
+| 0.1883        | 16.1405 | 4600 | 0.1712          | 0.1226 | 0.7757 | {'rouge1': 0.9019026814628661, 'rouge2': 0.8329976152495208, 'rougeL': 0.9016592566182122, 'rougeLsum': 0.9016282070055117} |
+| 0.1832        | 16.4917 | 4700 | 0.1713          | 0.1248 | 0.7733 | {'rouge1': 0.8995223210908226, 'rouge2': 0.8290222714943427, 'rougeL': 0.8994032458040973, 'rougeLsum': 0.899366836054343}  |
+| 0.1888        | 16.8428 | 4800 | 0.1698          | 0.1264 | 0.7721 | {'rouge1': 0.8982535964067325, 'rouge2': 0.8288396477969829, 'rougeL': 0.8977273104751539, 'rougeLsum': 0.8976965343038692} |
+| 0.1857        | 17.1932 | 4900 | 0.1718          | 0.1230 | 0.7757 | {'rouge1': 0.9026836932266615, 'rouge2': 0.833487216136216, 'rougeL': 0.9022265966641445, 'rougeLsum': 0.9024265508303717}  |
+| 0.1858        | 17.5443 | 5000 | 0.1705          | 0.1204 | 0.7792 | {'rouge1': 0.9046938392928605, 'rouge2': 0.8378406365404705, 'rougeL': 0.904281513678646, 'rougeLsum': 0.9040395290033556}  |
+| 0.1838        | 17.8955 | 5100 | 0.1713          | 0.1222 | 0.7773 | {'rouge1': 0.9025388171823945, 'rouge2': 0.8339417592886358, 'rougeL': 0.901982659834949, 'rougeLsum': 0.902191073170105}   |
+| 0.1784        | 18.2458 | 5200 | 0.1710          | 0.1228 | 0.7741 | {'rouge1': 0.9027135961066619, 'rouge2': 0.8338803735375095, 'rougeL': 0.9021313465595333, 'rougeLsum': 0.9022241778623963} |
+| 0.1748        | 18.5970 | 5300 | 0.1700          | 0.1205 | 0.7803 | {'rouge1': 0.9028943462107296, 'rouge2': 0.8360887354751103, 'rougeL': 0.9028049282097476, 'rougeLsum': 0.9026754818313738} |
+| 0.1785        | 18.9482 | 5400 | 0.1683          | 0.1191 | 0.7827 | {'rouge1': 0.9058615541527929, 'rouge2': 0.8397686128782502, 'rougeL': 0.905380338988193, 'rougeLsum': 0.9053818296865667}  |
+| 0.1715        | 19.2985 | 5500 | 0.1693          | 0.1197 | 0.7813 | {'rouge1': 0.9042951746231659, 'rouge2': 0.836346216169132, 'rougeL': 0.9038200107993153, 'rougeLsum': 0.9038172163450839}  |
+| 0.1743        | 19.6497 | 5600 | 0.1656          | 0.1198 | 0.7820 | {'rouge1': 0.9056265375031829, 'rouge2': 0.8389744056310899, 'rougeL': 0.9052424864181929, 'rougeLsum': 0.9052107898473516} |
+| 0.179         | 20.0    | 5700 | 0.1662          | 0.1200 | 0.7813 | {'rouge1': 0.9049957724375504, 'rouge2': 0.838664745594604, 'rougeL': 0.9047378933290081, 'rougeLsum': 0.9047796552290915}  |
+| 0.1705        | 20.3512 | 5800 | 0.1671          | 0.1158 | 0.7875 | {'rouge1': 0.9066988025993499, 'rouge2': 0.84144312190399, 'rougeL': 0.9066516406825784, 'rougeLsum': 0.9064192832523982}   |
+| 0.1737        | 20.7024 | 5900 | 0.1668          | 0.1191 | 0.7809 | {'rouge1': 0.9044882408830994, 'rouge2': 0.8379831752266652, 'rougeL': 0.9044191300138034, 'rougeLsum': 0.9041602420355832} |
+| 0.161         | 21.0527 | 6000 | 0.1675          | 0.1176 | 0.7855 | {'rouge1': 0.9057981628991186, 'rouge2': 0.8402890351927816, 'rougeL': 0.9055204210299842, 'rougeLsum': 0.9055949937515819} |
+| 0.1634        | 21.4039 | 6100 | 0.1656          | 0.1172 | 0.7849 | {'rouge1': 0.9052591361441602, 'rouge2': 0.8390947395192179, 'rougeL': 0.904901443457665, 'rougeLsum': 0.9049274938729717}  |
+| 0.1717        | 21.7550 | 6200 | 0.1655          | 0.1184 | 0.7850 | {'rouge1': 0.9056855411229652, 'rouge2': 0.8391993079429194, 'rougeL': 0.9051256484322957, 'rougeLsum': 0.9054924942388913} |
+| 0.1532        | 22.1054 | 6300 | 0.1640          | 0.1138 | 0.7895 | {'rouge1': 0.908514689307546, 'rouge2': 0.8432231540949471, 'rougeL': 0.9081112639646672, 'rougeLsum': 0.908272589564951}   |
+| 0.167         | 22.4565 | 6400 | 0.1626          | 0.1140 | 0.7921 | {'rouge1': 0.9103164260947302, 'rouge2': 0.847605808196412, 'rougeL': 0.9098933635969182, 'rougeLsum': 0.9098438675661122}  |
+| 0.1606        | 22.8077 | 6500 | 0.1632          | 0.1115 | 0.7956 | {'rouge1': 0.9112987341936316, 'rouge2': 0.8492302659397843, 'rougeL': 0.9111470792331513, 'rougeLsum': 0.9109445157457727} |
+| 0.1599        | 23.1580 | 6600 | 0.1642          | 0.1108 | 0.7960 | {'rouge1': 0.9118713013856634, 'rouge2': 0.848580538085876, 'rougeL': 0.9114618012803377, 'rougeLsum': 0.9116046463908325}  |
+| 0.156         | 23.5092 | 6700 | 0.1634          | 0.1122 | 0.7950 | {'rouge1': 0.9106394674615637, 'rouge2': 0.8474609350248357, 'rougeL': 0.910065118851396, 'rougeLsum': 0.9100662823980902}  |
+| 0.1589        | 23.8604 | 6800 | 0.1625          | 0.1130 | 0.7916 | {'rouge1': 0.9103350470563911, 'rouge2': 0.8474738385027315, 'rougeL': 0.9099995608863516, 'rougeLsum': 0.9100397914660747} |
+| 0.1622        | 24.2107 | 6900 | 0.1626          | 0.1117 | 0.7943 | {'rouge1': 0.9133839350790938, 'rouge2': 0.8511081438025545, 'rougeL': 0.9126575984413424, 'rougeLsum': 0.9126505650621592} |
+| 0.1521        | 24.5619 | 7000 | 0.1618          | 0.1109 | 0.7963 | {'rouge1': 0.912211680613469, 'rouge2': 0.8496181639239891, 'rougeL': 0.9117487343663472, 'rougeLsum': 0.9117499155750092}  |
+| 0.1503        | 24.9131 | 7100 | 0.1612          | 0.1115 | 0.7945 | {'rouge1': 0.9119319650927245, 'rouge2': 0.8489304675858942, 'rougeL': 0.9115897952726388, 'rougeLsum': 0.9116405544813904} |
+| 0.1504        | 25.2634 | 7200 | 0.1621          | 0.1103 | 0.7957 | {'rouge1': 0.9121880227389143, 'rouge2': 0.8492463401850738, 'rougeL': 0.9118447435959087, 'rougeLsum': 0.9118033384400608} |
+| 0.1519        | 25.6146 | 7300 | 0.1615          | 0.1118 | 0.7931 | {'rouge1': 0.9112498683244998, 'rouge2': 0.8480832804563686, 'rougeL': 0.9107105272229861, 'rougeLsum': 0.9107937803611704} |
+| 0.1479        | 25.9658 | 7400 | 0.1611          | 0.1098 | 0.7974 | {'rouge1': 0.9136242054251107, 'rouge2': 0.8509392563862166, 'rougeL': 0.9130826085424906, 'rougeLsum': 0.9132200366521013} |
+| 0.1437        | 26.3161 | 7500 | 0.1609          | 0.1099 | 0.7958 | {'rouge1': 0.9130313007924953, 'rouge2': 0.8502201629741937, 'rougeL': 0.9126113638303235, 'rougeLsum': 0.9126553187114919} |
+| 0.148         | 26.6673 | 7600 | 0.1609          | 0.1095 | 0.7968 | {'rouge1': 0.9137297115874194, 'rouge2': 0.8504308144873894, 'rougeL': 0.9133787545300738, 'rougeLsum': 0.9133679379590172} |
+| 0.1568        | 27.0176 | 7700 | 0.1607          | 0.1107 | 0.7945 | {'rouge1': 0.9131462089688931, 'rouge2': 0.8499313234434469, 'rougeL': 0.912728356326638, 'rougeLsum': 0.9129359515643884}  |
+| 0.1415        | 27.3687 | 7800 | 0.1607          | 0.1089 | 0.7971 | {'rouge1': 0.9137517387960785, 'rouge2': 0.8501495668327003, 'rougeL': 0.913338099417192, 'rougeLsum': 0.9136064287202272}  |
+| 0.1496        | 27.7199 | 7900 | 0.1597          | 0.1095 | 0.7966 | {'rouge1': 0.9130600978677069, 'rouge2': 0.8504810140358678, 'rougeL': 0.9127735778942226, 'rougeLsum': 0.9125405596281091} |
+| 0.1492        | 28.0702 | 8000 | 0.1596          | 0.1103 | 0.7955 | {'rouge1': 0.9120508555985649, 'rouge2': 0.8487431673284105, 'rougeL': 0.9118153720293241, 'rougeLsum': 0.9116675598551898} |
+| 0.1433        | 28.4214 | 8100 | 0.1600          | 0.1097 | 0.7967 | {'rouge1': 0.912271157742592, 'rouge2': 0.8491645313481614, 'rougeL': 0.9119383068792002, 'rougeLsum': 0.9120338521552568}  |
+| 0.1439        | 28.7726 | 8200 | 0.1591          | 0.1080 | 0.7995 | {'rouge1': 0.9144489684474263, 'rouge2': 0.8511979198969957, 'rougeL': 0.9137012695418792, 'rougeLsum': 0.9139716479376014} |
+| 0.1325        | 29.1229 | 8300 | 0.1598          | 0.1085 | 0.7984 | {'rouge1': 0.9139270928380065, 'rouge2': 0.8507649128122179, 'rougeL': 0.9135264918778438, 'rougeLsum': 0.9135318897680185} |
+| 0.1416        | 29.4741 | 8400 | 0.1595          | 0.1077 | 0.7997 | {'rouge1': 0.9146277490504104, 'rouge2': 0.8513032425483812, 'rougeL': 0.9140852902087768, 'rougeLsum': 0.9140094297264637} |
+| 0.1451        | 29.8253 | 8500 | 0.1592          | 0.1081 | 0.7992 | {'rouge1': 0.9142461267493629, 'rouge2': 0.8512223456977356, 'rougeL': 0.9140461108455781, 'rougeLsum': 0.9139759112519872} |
+### Framework versions
+- Transformers 4.49.0
+- Pytorch 2.6.0+cu124
+- Datasets 3.2.0
+- Tokenizers 0.21.0
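
The logged hyperparameters and the results table can be cross-checked with a little arithmetic. The sketch below uses only numbers copied from the card. It assumes a single training device, which the card does not state, and the variable names are illustrative rather than taken from the training script.

```python
# Values copied from the "Training hyperparameters" list in the card.
train_batch_size = 8
gradient_accumulation_steps = 4
num_epochs = 30

# Assumption: one device, so effective batch = per-device batch x accumulation steps.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# The results table logs step 5700 at epoch 20.0, which gives optimizer steps per epoch.
steps_per_epoch = round(5700 / 20.0)
total_steps = steps_per_epoch * num_epochs

# Relative WER reduction between the first (step 100) and last (step 8500) evaluations.
first_wer, final_wer = 0.2286, 0.1081
relative_wer_reduction = (first_wer - final_wer) / first_wer

print(total_train_batch_size)        # 32, matching total_train_batch_size in the card
print(steps_per_epoch, total_steps)  # 285 8550
print(f"{relative_wer_reduction:.1%}")
```

The computed effective batch size agrees with the `total_train_batch_size: 32` entry, which is consistent with the single-device assumption but does not prove it.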

adapter.fr.safetensors ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7044beb0469c98c1c6dbe2203c63c7f8115e83d27594e9d66b14c96f710a9b63
+size 8880524

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:02b42d33c1379d71f8d785de84e9e59064a41d798e51b5cebde7ddbf2dd4444c
+oid sha256:61556ab8b544ea1bfca5fafb50a4ab8ed1477665798920d251764e55bee4905d
 size 3858972908
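
Both `.safetensors` entries above are Git LFS pointer files, not the weights themselves: three `key value` lines giving the spec version, the SHA-256 of the real file, and its size in bytes. The following is a minimal sketch of parsing such a pointer and checking downloaded bytes against it. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, not part of any library.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if the bytes have the size and hash recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# Pointer text copied from the adapter.fr.safetensors hunk above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:7044beb0469c98c1c6dbe2203c63c7f8115e83d27594e9d66b14c96f710a9b63\n"
    "size 8880524\n"
)
print(pointer["algo"], pointer["size"])
```

For a file as large as `model.safetensors` (3858972908 bytes), hashing in fixed-size chunks read from disk would be preferable to holding the whole file in memory as this sketch does.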