mahwizzzz/urdu_text_correction

+---
+library_name: transformers
+license: mit
+base_model: facebook/mbart-large-50
+tags:
+- generated_from_trainer
+metrics:
+- wer
+- bleu
+- rouge
+model-index:
+- name: urdu_text_correction
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# urdu_text_correction
+This model is a fine-tuned version of [facebook/mbart-large-50](https://huggingface.co/facebook/mbart-large-50) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.4305
+- WER: 0.1795
+- CER: 0.0761
+- BLEU: 0.6996
+- ROUGE-1: 0.2025
+- ROUGE-2: 0.0699
+- ROUGE-L: 0.2023
+- METEOR: 0.8296
+- Gen Len: 28.4033
+- Exact Match: 0.1096
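
The card does not say how these metrics were computed. As a rough illustration of what WER, CER, and exact match measure, here is a minimal pure-Python sketch. It assumes whitespace tokenisation for words and plain Levenshtein distance; the function names are hypothetical, and the evaluation script actually used for this model may differ in tokenisation or normalisation.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # drop a reference item
                            curr[j - 1] + 1,      # extra item in the hypothesis
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_error_rate(refs: List[str], hyps: List[str], unit: str = "word") -> float:
    """Total edits divided by total reference length over the whole corpus.

    unit="word" gives a WER-style score, unit="char" a CER-style score.
    """
    edits = 0
    total = 0
    for ref, hyp in zip(refs, hyps):
        r = ref.split() if unit == "word" else list(ref)
        h = hyp.split() if unit == "word" else list(hyp)
        edits += edit_distance(r, h)
        total += len(r)
    return edits / total


def exact_match(refs: List[str], hyps: List[str]) -> float:
    """Fraction of hypotheses identical to their reference (after stripping)."""
    return sum(r.strip() == h.strip() for r, h in zip(refs, hyps)) / len(refs)
```

For a four-word reference in which the model gets one word wrong, `corpus_error_rate(..., unit="word")` returns 0.25 and `exact_match` returns 0.0. This is why exact match (0.1096) can stay low even when WER and CER look good: a single wrong character anywhere in the sentence fails the whole example.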
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 3e-05
+- train_batch_size: 32
+- eval_batch_size: 32
+- seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 128
+- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 500
+- num_epochs: 3
+- mixed_precision_training: Native AMP
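
Two of the settings above are easier to read with the arithmetic written out. The effective batch size is `train_batch_size × gradient_accumulation_steps` = 32 × 4 = 128, which matches the reported `total_train_batch_size` if training ran on a single device (an assumption; the card does not state the device count). The sketch below also reproduces the general shape of a cosine schedule with linear warmup: a linear ramp over the first 500 steps, then a half-cosine decay to zero. The total step count of 12,400 is an estimate derived from the logged epoch/step pairs in the training results, not a value given in the card, and `lr_at` is a hypothetical helper rather than the Trainer's own code.

```python
import math

LEARNING_RATE = 3e-5
WARMUP_STEPS = 500
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 4
# Assumption: about 12,400 optimizer steps in total; the card does not state this.
TOTAL_STEPS = 12_400


def effective_batch_size(per_device: int, accum: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return per_device * accum * num_devices


def lr_at(step: int, base_lr: float = LEARNING_RATE,
          warmup: int = WARMUP_STEPS, total: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step: linear warmup, then half-cosine decay."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup) / max(1, total - warmup))
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for s in (0, 250, 500, 6450, TOTAL_STEPS):
        print(s, lr_at(s))
```

Under these assumptions the peak rate of 3e-05 is reached at step 500, and the rate is already small during the final epoch.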
+### Training results
+| Training Loss | Epoch  | Step  | Validation Loss | WER    | CER    | BLEU   | ROUGE-1 | ROUGE-2 | ROUGE-L | METEOR | Gen Len | Exact Match |
+|:-------------:|:------:|:-----:|:---------------:|:------:|:------:|:------:|:-------:|:-------:|:-------:|:------:|:-------:|:-----------:|
+| 1.472         | 0.1209 | 500   | 1.2427          | 0.4582 | 0.2987 | 0.3494 | 0.1559 | 0.041  | 0.1562 | 0.5446 | 27.6629 | 0.0032      |
+| 0.9869        | 0.2419 | 1000  | 0.8662          | 0.3212 | 0.1755 | 0.5057 | 0.1779 | 0.0541 | 0.1778 | 0.6863 | 28.232  | 0.0227      |
+| 0.8582        | 0.3628 | 1500  | 0.7816          | 0.2878 | 0.1529 | 0.555  | 0.1837 | 0.0586 | 0.1838 | 0.7262 | 28.4877 | 0.0327      |
+| 0.7774        | 0.4837 | 2000  | 0.6885          | 0.257  | 0.1289 | 0.5881 | 0.1866 | 0.0603 | 0.1865 | 0.7504 | 27.897  | 0.0478      |
+| 0.6964        | 0.6047 | 2500  | 0.6298          | 0.2442 | 0.1172 | 0.6074 | 0.1896 | 0.0612 | 0.1894 | 0.7662 | 28.4579 | 0.0548      |
+| 0.6468        | 0.7256 | 3000  | 0.5851          | 0.224  | 0.1037 | 0.6326 | 0.1951 | 0.068  | 0.1952 | 0.7852 | 28.107  | 0.0676      |
+| 0.6148        | 0.8465 | 3500  | 0.5557          | 0.2224 | 0.1025 | 0.639  | 0.1935 | 0.0648 | 0.1935 | 0.7871 | 28.1589 | 0.0678      |
+| 0.5834        | 0.9675 | 4000  | 0.5342          | 0.2112 | 0.096  | 0.6535 | 0.1959 | 0.0638 | 0.1959 | 0.7989 | 28.1429 | 0.0769      |
+| 0.5252        | 1.0883 | 4500  | 0.5173          | 0.2035 | 0.091  | 0.662  | 0.197  | 0.068  | 0.1971 | 0.8044 | 28.2387 | 0.083       |
+| 0.5176        | 1.2092 | 5000  | 0.5023          | 0.2032 | 0.0911 | 0.6637 | 0.1982 | 0.0691 | 0.1985 | 0.8047 | 28.2411 | 0.0807      |
+| 0.5031        | 1.3301 | 5500  | 0.4873          | 0.1958 | 0.0846 | 0.6754 | 0.1969 | 0.0691 | 0.1969 | 0.8146 | 28.3568 | 0.0911      |
+| 0.4887        | 1.4511 | 6000  | 0.4771          | 0.1917 | 0.0836 | 0.6807 | 0.2003 | 0.0698 | 0.2002 | 0.8164 | 28.3507 | 0.0941      |
+| 0.4797        | 1.5720 | 6500  | 0.4696          | 0.1912 | 0.0833 | 0.6822 | 0.1998 | 0.0685 | 0.2002 | 0.8183 | 28.3144 | 0.0975      |
+| 0.4724        | 1.6929 | 7000  | 0.4599          | 0.1868 | 0.0802 | 0.6881 | 0.1992 | 0.0692 | 0.1992 | 0.8231 | 28.3751 | 0.1024      |
+| 0.4674        | 1.8139 | 7500  | 0.4532          | 0.1867 | 0.0804 | 0.6889 | 0.1998 | 0.0715 | 0.1999 | 0.823  | 28.4065 | 0.0996      |
+| 0.4548        | 1.9348 | 8000  | 0.4459          | 0.1826 | 0.0775 | 0.6952 | 0.2016 | 0.0704 | 0.2016 | 0.8268 | 28.3558 | 0.1071      |
+| 0.4109        | 2.0556 | 8500  | 0.4430          | 0.184  | 0.0783 | 0.6925 | 0.2016 | 0.0711 | 0.2016 | 0.8252 | 28.4034 | 0.1036      |
+| 0.4085        | 2.1766 | 9000  | 0.4400          | 0.1841 | 0.0789 | 0.6929 | 0.2016 | 0.0702 | 0.2015 | 0.8249 | 28.3683 | 0.1053      |
+| 0.4056        | 2.2975 | 9500  | 0.4370          | 0.1819 | 0.0771 | 0.6968 | 0.2006 | 0.0699 | 0.2005 | 0.8282 | 28.417  | 0.1077      |
+| 0.4005        | 2.4184 | 10000 | 0.4352          | 0.1823 | 0.0775 | 0.6968 | 0.2024 | 0.0704 | 0.2024 | 0.828  | 28.4263 | 0.1096      |
+| 0.4031        | 2.5394 | 10500 | 0.4324          | 0.1802 | 0.0762 | 0.6994 | 0.2031 | 0.0705 | 0.203  | 0.8293 | 28.4203 | 0.1096      |
+| 0.3984        | 2.6603 | 11000 | 0.4314          | 0.18   | 0.0766 | 0.699  | 0.2025 | 0.0705 | 0.2025 | 0.8292 | 28.3997 | 0.1096      |
+| 0.3924        | 2.7812 | 11500 | 0.4310          | 0.1802 | 0.0766 | 0.699  | 0.2019 | 0.0696 | 0.2018 | 0.8289 | 28.4055 | 0.1085      |
+| 0.3975        | 2.9022 | 12000 | 0.4305          | 0.1795 | 0.0761 | 0.6996 | 0.2025 | 0.0699 | 0.2023 | 0.8296 | 28.4033 | 0.1096      |
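
The Epoch and Step columns allow a rough estimate of the run's scale, which the card otherwise leaves unstated. The sketch below divides step by epoch for the first and last logged rows; treat the results as approximations, since the epoch values are rounded to four decimals and the multiplication by 128 assumes every optimizer step consumed a full effective batch.

```python
# (epoch, step) pairs copied from the first and last rows of the training results table.
FIRST_ROW = (0.1209, 500)
LAST_ROW = (2.9022, 12000)
TOTAL_TRAIN_BATCH_SIZE = 128  # from the hyperparameter list
NUM_EPOCHS = 3


def steps_per_epoch(epoch: float, step: int) -> float:
    """Optimizer steps per epoch implied by one logged (epoch, step) pair."""
    return step / epoch


per_epoch = steps_per_epoch(*LAST_ROW)
approx_total_steps = per_epoch * NUM_EPOCHS
approx_train_examples = per_epoch * TOTAL_TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print(f"steps per epoch ~ {per_epoch:.0f}")
    print(f"total steps     ~ {approx_total_steps:.0f}")
    print(f"train examples  ~ {approx_train_examples:.0f}")
```

Both rows imply a little over 4,100 steps per epoch, which would put the training set somewhere around half a million examples.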
+### Framework versions
+- Transformers 4.49.0
+- Pytorch 2.6.0+cu118
+- Datasets 3.3.2
+- Tokenizers 0.21.0
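
The card does not include usage instructions. With the library versions listed above, loading the checkpoint for inference might look like the sketch below. Several things here are assumptions rather than facts from the card: that the repository ships tokenizer files alongside the weights, that the source language code is mBART-50's `ur_PK`, and that forcing a language token at the start of generation is unnecessary (the optional `force_lang` argument is there in case it is needed). `correct_urdu_text` is a hypothetical helper name.

```python
from typing import Optional

MODEL_ID = "mahwizzzz/urdu_text_correction"  # repository name shown for this model card
SRC_LANG = "ur_PK"  # assumption: mBART-50's Urdu code was used during fine-tuning


def correct_urdu_text(text: str, model_id: str = MODEL_ID,
                      force_lang: Optional[str] = None,
                      max_new_tokens: int = 64) -> str:
    """Run one sentence through the fine-tuned model and return the decoded output."""
    # Imported lazily so this module can be read or imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    tokenizer.src_lang = SRC_LANG

    inputs = tokenizer(text, return_tensors="pt")
    generate_kwargs = {"max_new_tokens": max_new_tokens}
    if force_lang is not None:
        # Optionally force the first generated token to a language code such as "ur_PK".
        generate_kwargs["forced_bos_token_id"] = tokenizer.convert_tokens_to_ids(force_lang)

    output_ids = model.generate(**inputs, **generate_kwargs)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

A call such as `correct_urdu_text("...")` downloads the weights on first use. If the output starts with a stray language token or comes out in the wrong language, passing `force_lang="ur_PK"` is the first thing to try.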

generation_config.json CHANGED

@@ -1,5 +1,4 @@
 {
-  "_from_model_config": true,
   "bos_token_id": 0,
   "decoder_start_token_id": 2,
   "early_stopping": true,
