pakawadeep/ctfl-gec-th

@@ -2,79 +2,97 @@
 license: apache-2.0
 base_model: google/mt5-large
 tags:
 - generated_from_keras_callback
 model-index:
-- name: pakawadeep/mt5-large-finetuned-ctfl-augmented_1
-  results: []
 ---
-<!-- This model card has been generated automatically according to the information Keras had access to. You should
-probably proofread and complete it, then remove this comment. -->
-# pakawadeep/mt5-large-finetuned-ctfl-augmented_1
-This model is a fine-tuned version of [google/mt5-large](https://huggingface.co/google/mt5-large) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Train Loss: 0.2041
-- Validation Loss: 0.7119
-- Train Rouge1: 8.6634
-- Train Rouge2: 0.6931
-- Train Rougel: 8.5691
-- Train Rougelsum: 8.6987
-- Train Gen Len: 11.9158
-- Epoch: 21
 ## Model description
-More information needed
 ## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure
 ### Training hyperparameters
-The following hyperparameters were used during training:
-- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
-- training_precision: float32
-### Training results
-| Train Loss | Validation Loss | Train Rouge1 | Train Rouge2 | Train Rougel | Train Rougelsum | Train Gen Len | Epoch |
-|:----------:|:---------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:-----:|
-| 3.7859     | 1.7737          | 3.8966       | 1.1818       | 3.8139       | 3.8868          | 12.8069       | 0     |
-| 1.7728     | 1.2922          | 6.8010       | 1.1881       | 6.7657       | 6.7657          | 11.7376       | 1     |
-| 1.3356     | 1.0734          | 7.3020       | 1.8152       | 7.1782       | 7.3020          | 11.9010       | 2     |
-| 1.1070     | 0.9405          | 8.2037       | 2.1782       | 7.9915       | 8.2037          | 12.0198       | 3     |
-| 0.9583     | 0.8494          | 8.2037       | 2.1782       | 7.9915       | 8.2037          | 11.9901       | 4     |
-| 0.8463     | 0.7866          | 9.0288       | 2.4257       | 8.8873       | 8.9109          | 11.9802       | 5     |
-| 0.7662     | 0.7320          | 8.9816       | 2.3762       | 8.7694       | 8.8755          | 11.8960       | 6     |
-| 0.6961     | 0.7024          | 8.7341       | 1.8812       | 8.6457       | 8.6987          | 11.9010       | 7     |
-| 0.6444     | 0.6952          | 8.7341       | 1.8812       | 8.6457       | 8.6987          | 11.9406       | 8     |
-| 0.5881     | 0.6612          | 8.2862       | 0.7921       | 8.2390       | 8.2744          | 11.8960       | 9     |
-| 0.5386     | 0.6746          | 8.4689       | 1.3861       | 8.4335       | 8.4512          | 11.9307       | 10    |
-| 0.4944     | 0.6473          | 8.4689       | 1.3861       | 8.4335       | 8.4512          | 11.9406       | 11    |
-| 0.4524     | 0.6328          | 7.7793       | 0.7921       | 7.7027       | 7.7558          | 11.9307       | 12    |
-| 0.4161     | 0.6521          | 8.4689       | 1.3861       | 8.4335       | 8.4512          | 11.9307       | 13    |
-| 0.3812     | 0.6311          | 8.2862       | 0.7921       | 8.2390       | 8.2744          | 11.9109       | 14    |
-| 0.3488     | 0.6368          | 8.2862       | 0.7921       | 8.2390       | 8.2744          | 11.8960       | 15    |
-| 0.3181     | 0.6449          | 8.7812       | 0.7921       | 8.6987       | 8.7930          | 11.9455       | 16    |
-| 0.2898     | 0.6495          | 8.8461       | 0.8911       | 8.7400       | 8.8637          | 11.9307       | 17    |
-| 0.2677     | 0.6583          | 8.8461       | 0.8911       | 8.7400       | 8.8637          | 11.9059       | 18    |
-| 0.2435     | 0.6823          | 8.8461       | 0.8911       | 8.7400       | 8.8637          | 11.9653       | 19    |
-| 0.2227     | 0.6897          | 8.6575       | 0.6931       | 8.5337       | 8.6693          | 11.9703       | 20    |
-| 0.2041     | 0.7119          | 8.6634       | 0.6931       | 8.5691       | 8.6987          | 11.9158       | 21    |
 ### Framework versions
 - Transformers 4.41.2
 - TensorFlow 2.15.0
 - Datasets 2.20.0
 - Tokenizers 0.19.1

 license: apache-2.0
 base_model: google/mt5-large
 tags:
+- thai
+- grammatical-error-correction
+- mt5
+- fine-tuned
+- l2-learners
 - generated_from_keras_callback
 model-index:
+- name: pakawadeep/ctfl-gec-th
+  results:
+    - task:
+        name: Grammatical Error Correction
+        type: text2text-generation
+      dataset:
+        name: CTFL-GEC (augmented with Self-Instruct 200%)
+        type: custom
+      metrics:
+        - name: Precision
+          type: precision
+          value: 0.47
+        - name: Recall
+          type: recall
+          value: 0.47
+        - name: F1
+          type: f1
+          value: 0.47
+        - name: F0.5
+          type: f0.5
+          value: 0.47
+        - name: BLEU
+          type: bleu
+          value: 0.69
+        - name: GLEU
+          type: gleu
+          value: 0.68
+        - name: CHRF
+          type: chrf
+          value: 0.87
+language:
+- th
 ---
+# pakawadeep/ctfl-gec-th
+This model is a fine-tuned version of [google/mt5-large](https://huggingface.co/google/mt5-large) for **Grammatical Error Correction (GEC)** of **Thai** text written by **L2 learners**. It was developed as part of the research *"Grammatical Error Correction for L2 Learners of Thai Using Large Language Models"* and is the best-performing model in that study.
 ## Model description
+This model uses the mT5-large architecture and was fine-tuned on the CTFL-GEC dataset, which contains human-annotated grammatical error corrections of writing by L2 learners of Thai. To improve generalization, the dataset was augmented with the Self-Instruct method, adding 200% additional synthetic pairs.
+The model corrects sentence-level grammatical errors typical of L2 Thai writing, including word-order errors, omissions, and incorrect particles.
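A minimal inference sketch follows. It assumes the repository ships TensorFlow weights (the card lists TensorFlow 2.15.0 and the Keras callback tag) and that inputs need no task prefix; neither is confirmed by the card. The helper name `correct` and the `max_new_tokens` default are hypothetical.

```python
MODEL_ID = "pakawadeep/ctfl-gec-th"  # repository name taken from this model card


def correct(sentences, model_id=MODEL_ID, max_new_tokens=64):
    """Return one corrected string per input sentence.

    Assumptions (not confirmed by the card): the repository provides
    TensorFlow weights, and inputs are passed without a task prefix.
    """
    # Imported lazily so this module loads even without TensorFlow installed.
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize the whole batch, padding to the longest sentence.
    batch = tokenizer(list(sentences), return_tensors="tf", padding=True, truncation=True)
    output_ids = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `correct([learner_sentence])` downloads the weights on first use and returns a list with one corrected sentence per input.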
 ## Intended uses & limitations
+### Intended uses
+- Grammatical error correction for Thai language learners
+- Linguistic analysis of L2 learner errors
+- Research in low-resource GEC methods
+### Limitations
+- May not generalize to informal or dialectal Thai
+- Performance may degrade on sentence types or domains not represented in the training data
+- Designed for Thai GEC only; not optimized for multilingual correction tasks
 ## Training and evaluation data
+The model was fine-tuned on a combined dataset consisting of:
+- **CTFL-GEC**: A manually annotated corpus of Thai learner writing (370 writing samples, 4,200+ sentences)
+- **Self-Instruct augmentation (200%)**: Synthetic GEC pairs generated using LLM prompting
+Evaluation was conducted on a held-out portion of the human-annotated dataset using precision, recall, F1, F0.5, BLEU, GLEU, and chrF.
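The card's metadata reports precision, recall, F1, and F0.5 all as 0.47. The sketch below uses the standard F-beta definition to show why equal precision and recall give the same score for any beta. It assumes the reported values are rounded and does not reproduce the study's actual scoring tool, which the card does not name.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Weighted harmonic mean of precision and recall (standard F-beta)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values as reported in the card metadata (presumably rounded).
precision, recall = 0.47, 0.47

f1 = f_beta(precision, recall, beta=1.0)
f05 = f_beta(precision, recall, beta=0.5)  # weights precision more than recall

print(round(f1, 2), round(f05, 2))  # prints: 0.47 0.47
```

When precision equals recall, the formula reduces to that shared value, so F1 and F0.5 carry no extra information in this case.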
 ## Training procedure
 ### Training hyperparameters
+- **Optimizer**: AdamWeightDecay
+- **Learning rate**: 2e-5
+- **Beta1/Beta2**: 0.9 / 0.999
+- **Epsilon**: 1e-7
+- **Weight decay**: 0.01
+- **Precision**: float32
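The listed settings map onto the `AdamWeightDecay` optimizer that Transformers provides for TensorFlow. The following is a sketch of how such an optimizer could be rebuilt, not the author's training script; `OPTIMIZER_CONFIG` and `build_optimizer` are hypothetical names, and float32 is simply TensorFlow's default precision.

```python
# Hyperparameters copied from the list above.
OPTIMIZER_CONFIG = {
    "learning_rate": 2e-5,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-7,
    "weight_decay_rate": 0.01,
}


def build_optimizer(config=OPTIMIZER_CONFIG):
    """Create an AdamWeightDecay optimizer from the card's settings."""
    # Imported lazily so the config can be inspected without TensorFlow installed.
    from transformers import AdamWeightDecay

    return AdamWeightDecay(**config)
```

A Keras model would then be compiled with `model.compile(optimizer=build_optimizer())`.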
 ### Framework versions
 - Transformers 4.41.2
 - TensorFlow 2.15.0
 - Datasets 2.20.0
 - Tokenizers 0.19.1
+## Citation
+If you use this model, please cite the associated thesis:
+```
+Pakawadee P. Chookwan, "Grammatical Error Correction for L2 Learners of Thai Using Large Language Models", 2025.
+```