mt5-base-finetuned-xlsum-zh-en

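This model is a fine-tuned version of google/mt5-base. The training dataset is not specified (the auto-generated metadata records it as None). Results on the evaluation set are reported per epoch in the table at the end of this card.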

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5.6e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 8
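
As a rough illustration of the settings above, the sketch below collects them under the keyword names used by transformers.Seq2SeqTrainingArguments and reproduces a linear learning-rate decay in plain Python. Several things here are assumptions rather than facts from this card: that training ran on a single device (so train_batch_size maps to per_device_train_batch_size), that no warmup was used (none is listed), and that the schedule spans the 2000 optimizer steps shown in the results table. The helper name linear_lr is hypothetical.

```python
# Hyperparameters from this card, keyed by the argument names of
# transformers.Seq2SeqTrainingArguments. Assumption: single-device training,
# so the card's batch sizes are treated as per-device values.
training_kwargs = {
    "learning_rate": 5.6e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 8,
}


def linear_lr(step, base_lr=5.6e-5, total_steps=2000, warmup_steps=0):
    """Learning rate at a given optimizer step under a linear schedule.

    Hypothetical helper: ramps up linearly during warmup (assumed to be
    zero steps here), then decays linearly to 0 at total_steps.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at the end of each epoch (250 steps per epoch).
    for epoch in range(1, 9):
        print(epoch, linear_lr(epoch * 250))
```

If transformers is installed, the dictionary could be passed as Seq2SeqTrainingArguments(output_dir=..., **training_kwargs); the output directory and any generation or evaluation settings are not recorded in this card and would have to be chosen separately.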

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	RougeL	RougeLsum
6.7404	1.0	250	2.9751	11.0247	2.0376	8.8713	8.8683
3.7772	2.0	500	2.8667	13.3444	2.5848	10.5113	10.5307
3.489	3.0	750	2.8459	14.7042	3.2828	12.1022	12.0708
3.3125	4.0	1000	2.8361	14.7435	3.4802	11.9489	12.0336
3.2107	5.0	1250	2.8215	16.05	3.4239	13.6702	13.7339
3.1378	6.0	1500	2.8289	16.442	3.9602	13.7535	13.8249
3.0887	7.0	1750	2.8294	16.1141	3.4215	13.3288	13.4566
3.064	8.0	2000	2.8249	16.2375	3.5607	13.5455	13.6299
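
The table shows that the best epoch depends on the metric used. The following minimal sketch copies the evaluation columns from the table and picks the best epoch by validation loss and by each ROUGE score. The variable names are illustrative only, and the card does not state which checkpoint the published weights correspond to.

```python
# (epoch, validation_loss, rouge1, rouge2, rougeL, rougeLsum), copied from the table.
rows = [
    (1, 2.9751, 11.0247, 2.0376, 8.8713, 8.8683),
    (2, 2.8667, 13.3444, 2.5848, 10.5113, 10.5307),
    (3, 2.8459, 14.7042, 3.2828, 12.1022, 12.0708),
    (4, 2.8361, 14.7435, 3.4802, 11.9489, 12.0336),
    (5, 2.8215, 16.05, 3.4239, 13.6702, 13.7339),
    (6, 2.8289, 16.442, 3.9602, 13.7535, 13.8249),
    (7, 2.8294, 16.1141, 3.4215, 13.3288, 13.4566),
    (8, 2.8249, 16.2375, 3.5607, 13.5455, 13.6299),
]

# Lower is better for validation loss; higher is better for ROUGE.
best_by_loss = min(rows, key=lambda r: r[1])
metric_columns = {"rouge1": 2, "rouge2": 3, "rougeL": 4, "rougeLsum": 5}
best_by_rouge = {
    name: max(rows, key=lambda r, c=col: r[c])[0]
    for name, col in metric_columns.items()
}

if __name__ == "__main__":
    print("best epoch by validation loss:", best_by_loss[0])
    print("best epoch by ROUGE metric:", best_by_rouge)
```

On these numbers, validation loss is lowest at epoch 5, while all four ROUGE scores peak at epoch 6. Both differ from the final epoch 8 checkpoint.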