---
library_name: transformers
license: cc-by-nc-4.0
base_model: facebook/mms-1b-all
tags:
- generated_from_trainer
metrics:
- wer
model-index:
- name: mms-1b-all-lg-GRAIN-v1
  results: []
---

# mms-1b-all-lg-GRAIN-v1

This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0539
- Wer: 0.0650
- Cer: 0.0121

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- learning_rate: 0.001
- train_batch_size: 8
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 100
- mixed_precision_training: Native AMP
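The list above maps directly onto `transformers.TrainingArguments`. Below is a hedged reconstruction; the output directory, the per-epoch evaluation strategy, and the omitted dataset/model/collator wiring are assumptions, not taken from the actual training script:

```python
# Hypothetical reconstruction of the training configuration from the
# hyperparameters listed above. The output_dir, eval strategy, and the
# dataset/model/collator wiring are assumptions, not the author's script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mms-1b-all-lg-GRAIN-v1",  # assumed output directory
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,        # 8 x 2 = total train batch size 16
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=100,
    fp16=True,                            # "Native AMP" mixed precision
    eval_strategy="epoch",                # assumed: the table below logs per epoch
)

# The Trainer would then be built with the (unspecified) model, datasets,
# and CTC data collator:
# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=..., eval_dataset=..., data_collator=...)
```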
### Training results

| Training Loss | Epoch | Step   | Validation Loss | Wer    | Cer    |
|:-------------:|:-----:|:------:|:---------------:|:------:|:------:|
| 0.6472        | 1.0   | 1385   | 0.1179          | 0.1626 | 0.0285 |
| 0.3081        | 2.0   | 2770   | 0.1075          | 0.1525 | 0.0264 |
| 0.2974        | 3.0   | 4155   | 0.1057          | 0.1517 | 0.0263 |
| 0.2874        | 4.0   | 5540   | 0.0980          | 0.1374 | 0.0243 |
| 0.2816        | 5.0   | 6925   | 0.0988          | 0.1357 | 0.0237 |
| 0.2749        | 6.0   | 8310   | 0.0925          | 0.1258 | 0.0228 |
| 0.27          | 7.0   | 9695   | 0.0896          | 0.1224 | 0.0218 |
| 0.2637        | 8.0   | 11080  | 0.0856          | 0.1142 | 0.0207 |
| 0.26          | 9.0   | 12465  | 0.0849          | 0.1200 | 0.0218 |
| 0.2564        | 10.0  | 13850  | 0.0838          | 0.1079 | 0.0199 |
| 0.2524        | 11.0  | 15235  | 0.0806          | 0.1100 | 0.0194 |
| 0.2497        | 12.0  | 16620  | 0.0784          | 0.1115 | 0.0198 |
| 0.2463        | 13.0  | 18005  | 0.0774          | 0.1069 | 0.0193 |
| 0.2462        | 14.0  | 19390  | 0.0813          | 0.1083 | 0.0196 |
| 0.2406        | 15.0  | 20775  | 0.0771          | 0.1021 | 0.0184 |
| 0.2369        | 16.0  | 22160  | 0.0772          | 0.1017 | 0.0190 |
| 0.235         | 17.0  | 23545  | 0.0740          | 0.0939 | 0.0179 |
| 0.2313        | 18.0  | 24930  | 0.0735          | 0.0988 | 0.0178 |
| 0.2297        | 19.0  | 26315  | 0.0743          | 0.1028 | 0.0184 |
| 0.2265        | 20.0  | 27700  | 0.0724          | 0.0997 | 0.0178 |
| 0.2229        | 21.0  | 29085  | 0.0728          | 0.0959 | 0.0175 |
| 0.2205        | 22.0  | 30470  | 0.0709          | 0.0930 | 0.0171 |
| 0.2194        | 23.0  | 31855  | 0.0677          | 0.0903 | 0.0166 |
| 0.2159        | 24.0  | 33240  | 0.0681          | 0.0903 | 0.0163 |
| 0.2155        | 25.0  | 34625  | 0.0694          | 0.0918 | 0.0170 |
| 0.2133        | 26.0  | 36010  | 0.0679          | 0.0930 | 0.0171 |
| 0.2103        | 27.0  | 37395  | 0.0713          | 0.0926 | 0.0167 |
| 0.2076        | 28.0  | 38780  | 0.0665          | 0.0918 | 0.0164 |
| 0.2067        | 29.0  | 40165  | 0.0679          | 0.0853 | 0.0160 |
| 0.2068        | 30.0  | 41550  | 0.0640          | 0.0835 | 0.0155 |
| 0.2035        | 31.0  | 42935  | 0.0644          | 0.0831 | 0.0157 |
| 0.2015        | 32.0  | 44320  | 0.0646          | 0.0893 | 0.0162 |
| 0.1993        | 33.0  | 45705  | 0.0656          | 0.0883 | 0.0159 |
| 0.1976        | 34.0  | 47090  | 0.0637          | 0.0825 | 0.0151 |
| 0.1957        | 35.0  | 48475  | 0.0620          | 0.0827 | 0.0152 |
| 0.1944        | 36.0  | 49860  | 0.0616          | 0.0812 | 0.0153 |
| 0.1929        | 37.0  | 51245  | 0.0604          | 0.0847 | 0.0153 |
| 0.1899        | 38.0  | 52630  | 0.0624          | 0.0858 | 0.0153 |
| 0.1897        | 39.0  | 54015  | 0.0621          | 0.0872 | 0.0156 |
| 0.1888        | 40.0  | 55400  | 0.0609          | 0.0808 | 0.0154 |
| 0.1872        | 41.0  | 56785  | 0.0627          | 0.0841 | 0.0151 |
| 0.1845        | 42.0  | 58170  | 0.0602          | 0.0849 | 0.0151 |
| 0.1857        | 43.0  | 59555  | 0.0629          | 0.0866 | 0.0157 |
| 0.183         | 44.0  | 60940  | 0.0586          | 0.0775 | 0.0143 |
| 0.1821        | 45.0  | 62325  | 0.0604          | 0.0856 | 0.0152 |
| 0.1806        | 46.0  | 63710  | 0.0614          | 0.0835 | 0.0148 |
| 0.1788        | 47.0  | 65095  | 0.0592          | 0.0818 | 0.0146 |
| 0.1776        | 48.0  | 66480  | 0.0590          | 0.0804 | 0.0147 |
| 0.1765        | 49.0  | 67865  | 0.0596          | 0.0796 | 0.0148 |
| 0.175         | 50.0  | 69250  | 0.0571          | 0.0787 | 0.0145 |
| 0.1733        | 51.0  | 70635  | 0.0586          | 0.0806 | 0.0148 |
| 0.1738        | 52.0  | 72020  | 0.0577          | 0.0789 | 0.0141 |
| 0.1702        | 53.0  | 73405  | 0.0579          | 0.0764 | 0.0146 |
| 0.1702        | 54.0  | 74790  | 0.0592          | 0.0766 | 0.0142 |
| 0.1689        | 55.0  | 76175  | 0.0555          | 0.0742 | 0.0138 |
| 0.1671        | 56.0  | 77560  | 0.0572          | 0.0773 | 0.0141 |
| 0.1658        | 57.0  | 78945  | 0.0557          | 0.0766 | 0.0140 |
| 0.1659        | 58.0  | 80330  | 0.0550          | 0.0769 | 0.0142 |
| 0.1643        | 59.0  | 81715  | 0.0539          | 0.0735 | 0.0131 |
| 0.1641        | 60.0  | 83100  | 0.0537          | 0.0754 | 0.0135 |
| 0.1623        | 61.0  | 84485  | 0.0543          | 0.0746 | 0.0133 |
| 0.1608        | 62.0  | 85870  | 0.0537          | 0.0698 | 0.0131 |
| 0.1587        | 63.0  | 87255  | 0.0567          | 0.0756 | 0.0132 |
| 0.1594        | 64.0  | 88640  | 0.0571          | 0.0740 | 0.0131 |
| 0.1576        | 65.0  | 90025  | 0.0557          | 0.0723 | 0.0135 |
| 0.1579        | 66.0  | 91410  | 0.0564          | 0.0760 | 0.0138 |
| 0.1575        | 67.0  | 92795  | 0.0579          | 0.0729 | 0.0135 |
| 0.1555        | 68.0  | 94180  | 0.0571          | 0.0742 | 0.0136 |
| 0.1547        | 69.0  | 95565  | 0.0557          | 0.0680 | 0.0127 |
| 0.1537        | 70.0  | 96950  | 0.0567          | 0.0708 | 0.0132 |
| 0.1516        | 71.0  | 98335  | 0.0569          | 0.0713 | 0.0131 |
| 0.1505        | 72.0  | 99720  | 0.0573          | 0.0723 | 0.0133 |
| 0.1491        | 73.0  | 101105 | 0.0569          | 0.0688 | 0.0125 |
| 0.1495        | 74.0  | 102490 | 0.0579          | 0.0760 | 0.0134 |
| 0.1491        | 75.0  | 103875 | 0.0560          | 0.0750 | 0.0132 |
| 0.1479        | 76.0  | 105260 | 0.0551          | 0.0684 | 0.0126 |
| 0.1464        | 77.0  | 106645 | 0.0539          | 0.0696 | 0.0124 |
| 0.145         | 78.0  | 108030 | 0.0555          | 0.0702 | 0.0127 |
| 0.1456        | 79.0  | 109415 | 0.0539          | 0.0679 | 0.0122 |
| 0.1438        | 80.0  | 110800 | 0.0547          | 0.0677 | 0.0123 |
| 0.1437        | 81.0  | 112185 | 0.0524          | 0.0661 | 0.0120 |
| 0.1418        | 82.0  | 113570 | 0.0531          | 0.0679 | 0.0125 |
| 0.1419        | 83.0  | 114955 | 0.0548          | 0.0688 | 0.0125 |
| 0.1419        | 84.0  | 116340 | 0.0529          | 0.0646 | 0.0121 |
| 0.1396        | 85.0  | 117725 | 0.0526          | 0.0686 | 0.0123 |
| 0.1393        | 86.0  | 119110 | 0.0527          | 0.0684 | 0.0122 |
| 0.1385        | 87.0  | 120495 | 0.0543          | 0.0682 | 0.0122 |
| 0.1368        | 88.0  | 121880 | 0.0536          | 0.0665 | 0.0120 |
| 0.1359        | 89.0  | 123265 | 0.0550          | 0.0667 | 0.0120 |
| 0.1365        | 90.0  | 124650 | 0.0515          | 0.0646 | 0.0117 |
| 0.1359        | 91.0  | 126035 | 0.0541          | 0.0669 | 0.0123 |
| 0.1348        | 92.0  | 127420 | 0.0521          | 0.0659 | 0.0118 |
| 0.1357        | 93.0  | 128805 | 0.0537          | 0.0669 | 0.0122 |
| 0.1337        | 94.0  | 130190 | 0.0542          | 0.0696 | 0.0126 |
| 0.1327        | 95.0  | 131575 | 0.0546          | 0.0638 | 0.0118 |
| 0.133         | 96.0  | 132960 | 0.0539          | 0.0650 | 0.0121 |

### Framework versions

- Transformers 4.47.0
- Pytorch 2.1.0+cu118
- Datasets 3.1.0
- Tokenizers 0.21.0
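Since the usage sections above are still empty, here is a hypothetical inference sketch using the generic MMS/Wav2Vec2 CTC decoding pattern; the repository id and audio path are placeholders, not details taken from this card:

```python
# Hypothetical inference example: greedy CTC decoding of a 16 kHz mono clip.
# "your-namespace/mms-1b-all-lg-GRAIN-v1" and "sample.wav" are placeholders.
import librosa
import torch
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "your-namespace/mms-1b-all-lg-GRAIN-v1"  # placeholder repo id
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

speech, _ = librosa.load("sample.wav", sr=16_000)  # MMS expects 16 kHz audio
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

pred_ids = torch.argmax(logits, dim=-1)  # greedy per-frame token ids
print(processor.batch_decode(pred_ids)[0])  # collapses repeats / blanks
```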