pakawadeep/mt5-small-finetuned-ctfl

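This model is a fine-tuned version of google/mt5-small on the CTFL-GEC dataset. It achieves the following results at the final logged epoch (27), as reported in the training log:

Train Loss: 0.9386
Validation Loss: 1.1006
Train Rouge1: 8.2390
Train Rouge2: 1.3861
Train Rougel: 8.5219
Train Rougelsum: 8.2744
Train Gen Len: 11.9505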

Model description

More information needed

The following hyperparameters were used during training:

optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
training_precision: float32
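
To make the optimizer settings concrete, the following is a minimal sketch of a single Adam update with decoupled weight decay on one scalar parameter, using the values listed above. It is pure Python for illustration only, and it is not the code that trained this model. The function name `adamw_step` and the order in which decay and the Adam step are applied are assumptions; the `AdamWeightDecay` implementation used in training may differ in detail, for example in which parameters it excludes from decay.

```python
import math

# Hyperparameters copied from the optimizer config above.
CONFIG = {
    "learning_rate": 2e-05,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "weight_decay_rate": 0.01,
}


def adamw_step(param, grad, m, v, step, cfg=CONFIG):
    """Apply one Adam update with decoupled weight decay to a scalar.

    Hypothetical helper for illustration. `step` counts from 1.
    Returns the updated (param, m, v).
    """
    lr = cfg["learning_rate"]
    b1, b2 = cfg["beta_1"], cfg["beta_2"]
    # Exponential moving averages of the gradient and of its square.
    m = b1 * m + (1.0 - b1) * grad
    v = b2 * v + (1.0 - b2) * grad * grad
    # Bias correction for the zero-initialised averages.
    m_hat = m / (1.0 - b1 ** step)
    v_hat = v / (1.0 - b2 ** step)
    # Decay shrinks the parameter directly instead of going through the gradient.
    param = param - lr * cfg["weight_decay_rate"] * param
    # Standard Adam step on the bias-corrected moments.
    param = param - lr * m_hat / (math.sqrt(v_hat) + cfg["epsilon"])
    return param, m, v


if __name__ == "__main__":
    p, m, v = adamw_step(param=1.0, grad=0.5, m=0.0, v=0.0, step=1)
    print(p, m, v)
```

With a learning rate of 2e-05, the first update moves a parameter by roughly the learning rate itself, and the decay term contributes a further shrink of about lr * 0.01 times the parameter value.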

Train Loss	Validation Loss	Train Rouge1	Train Rouge2	Train Rougel	Train Rougelsum	Train Gen Len	Epoch
2.0267	1.4685	8.4866	2.1287	8.6987	8.4158	11.8317	0
1.8791	1.4054	8.4866	2.1287	8.6987	8.4158	11.7772	1
1.7619	1.4034	8.4866	2.1287	8.6987	8.4158	11.8069	2
1.6707	1.3687	8.9463	2.1287	9.1938	8.9816	11.8762	3
1.5788	1.3420	8.4866	2.1287	8.6987	8.4866	11.9059	4
1.5039	1.3403	8.4866	2.1287	8.6987	8.4866	11.9158	5
1.4301	1.3176	8.4866	2.1287	8.6987	8.4866	11.9307	6
1.3983	1.3101	8.6634	2.3102	8.7871	8.6634	11.9257	7
1.3550	1.2941	8.7694	2.2772	8.9816	8.7694	11.9356	8
1.3139	1.2659	8.7694	2.2772	8.9816	8.7694	11.9257	9
1.2710	1.2536	8.7694	2.2772	8.9816	8.7694	11.9257	10
1.2479	1.2394	8.7694	2.2772	8.9816	8.7694	11.9257	11
1.2359	1.2252	8.7694	2.2772	8.9816	8.7694	11.9406	12
1.2031	1.2193	8.7694	2.2772	8.9816	8.7694	11.9307	13
1.1813	1.1963	8.7694	2.2772	8.9816	8.7694	11.9455	14
1.1556	1.1897	8.7694	2.2772	8.9816	8.7694	11.9455	15
1.1242	1.1786	8.7694	2.2772	8.9816	8.7694	11.9406	16
1.1060	1.1575	8.7694	2.2772	8.9816	8.7694	11.9554	17
1.0808	1.1620	8.7694	2.2772	8.9816	8.7694	11.9505	18
1.0620	1.1564	8.7694	2.2772	8.9816	8.7694	11.9554	19
1.0489	1.1491	8.2744	1.8812	8.4866	8.2744	11.9505	20
1.0313	1.1292	8.2744	1.8812	8.4866	8.2744	11.9554	21
1.0104	1.1294	8.2744	1.8812	8.4866	8.2744	11.9802	22
0.9917	1.1190	8.4866	1.8812	8.7341	8.5219	11.9505	23
0.9642	1.1348	8.2390	1.3861	8.5219	8.2744	11.9406	24
0.9629	1.1197	8.2390	1.3861	8.5219	8.2744	11.9406	25
0.9512	1.1060	8.2390	1.3861	8.5219	8.2744	11.9505	26
0.9386	1.1006	8.2390	1.3861	8.5219	8.2744	11.9505	27
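
The loss columns in the table above can be summarised with a short script. The sketch below copies the Train Loss and Validation Loss columns into lists and reports the epoch with the lowest validation loss, the epochs where validation loss rose relative to the previous epoch, and the first epoch where train loss drops below validation loss. The helper names are hypothetical and the script is only a reading aid for the log, not part of the training code.

```python
# Loss columns copied from the training log above, indexed by epoch (0..27).
TRAIN_LOSS = [
    2.0267, 1.8791, 1.7619, 1.6707, 1.5788, 1.5039, 1.4301,
    1.3983, 1.3550, 1.3139, 1.2710, 1.2479, 1.2359, 1.2031,
    1.1813, 1.1556, 1.1242, 1.1060, 1.0808, 1.0620, 1.0489,
    1.0313, 1.0104, 0.9917, 0.9642, 0.9629, 0.9512, 0.9386,
]
VAL_LOSS = [
    1.4685, 1.4054, 1.4034, 1.3687, 1.3420, 1.3403, 1.3176,
    1.3101, 1.2941, 1.2659, 1.2536, 1.2394, 1.2252, 1.2193,
    1.1963, 1.1897, 1.1786, 1.1575, 1.1620, 1.1564, 1.1491,
    1.1292, 1.1294, 1.1190, 1.1348, 1.1197, 1.1060, 1.1006,
]


def best_epoch(val_loss):
    """Return the epoch index with the lowest validation loss."""
    return min(range(len(val_loss)), key=lambda i: val_loss[i])


def regressions(val_loss):
    """Return epochs whose validation loss is higher than the previous epoch's."""
    return [i for i in range(1, len(val_loss)) if val_loss[i] > val_loss[i - 1]]


def first_crossover(train_loss, val_loss):
    """Return the first epoch where train loss is below validation loss, or None."""
    for epoch, (train, val) in enumerate(zip(train_loss, val_loss)):
        if train < val:
            return epoch
    return None


if __name__ == "__main__":
    print("best epoch:", best_epoch(VAL_LOSS))
    print("validation regressions at epochs:", regressions(VAL_LOSS))
    print("train loss first below validation loss at epoch:",
          first_crossover(TRAIN_LOSS, VAL_LOSS))
```

Validation loss is still decreasing at the last logged epoch, while the ROUGE columns stay nearly flat throughout, so the loss curve alone does not show that generation quality improved.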