mt5-base-finetuned-xsum

This model is a fine-tuned version of google/mt5-base on an unspecified dataset (the dataset field was not set when this card was generated). It achieves the following results on the evaluation set:

  • Loss: nan
  • Rouge1: 1.5134
  • Rouge2: 0.2001
  • Rougel: 1.4917
  • Rougelsum: 1.4788
  • Gen Len: 8.6992
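The ROUGE scores above are typically F-measures scaled by 100, as computed by the `rouge_score` package used in the Hugging Face summarization examples. As a point of reference for what these metrics measure, here is a minimal sketch of ROUGE-1 F1 (unigram overlap only; the real package also applies stemming and computes ROUGE-2/ROUGE-L):

```python
from collections import Counter

def rouge1_f(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap between candidate and reference.

    Minimal sketch for illustration; production evaluation should use the
    `rouge_score` package, which also handles stemming and ROUGE-2/ROUGE-L.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # each unigram counted at most min(cand, ref) times
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

A ROUGE-1 of 1.5134 on the 0–100 scale thus corresponds to roughly 1.5% unigram F1 overlap with the references, which is near the floor for a summarization model.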

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
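With a linear scheduler, the learning rate decays from 1e-06 to 0 over the full run (50 epochs × 559 steps/epoch = 27,950 steps, per the results table below). A minimal sketch of this schedule, assuming zero warmup steps since the card lists no warmup setting:

```python
def linear_lr(step: int, base_lr: float = 1e-06,
              total_steps: int = 27950, warmup_steps: int = 0) -> float:
    """Linear decay schedule, as in transformers' get_linear_schedule_with_warmup.

    Assumes warmup_steps=0 (the card does not list a warmup setting);
    total_steps is taken from the results table (50 epochs x 559 steps).
    """
    if step < warmup_steps:
        # linear ramp-up from 0 to base_lr during warmup
        return base_lr * step / max(1, warmup_steps)
    # linear decay from base_lr down to 0 at total_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Note that 1e-06 is a very small starting point for mT5 fine-tuning (the T5/mT5 papers and common recipes use rates on the order of 1e-04 to 1e-03 with Adafactor or AdamW), so the effective learning rate here stays tiny throughout.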

Training results

The training loss is reported as 0.0 and the validation loss as NaN at every epoch, and all metrics are identical from epoch 1 onward, which indicates that training diverged at or before the first evaluation and the model never improved afterward.

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 0.0           | 1.0   | 559   | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 2.0   | 1118  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 3.0   | 1677  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 4.0   | 2236  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 5.0   | 2795  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 6.0   | 3354  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 7.0   | 3913  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 8.0   | 4472  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 9.0   | 5031  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 10.0  | 5590  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 11.0  | 6149  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 12.0  | 6708  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 13.0  | 7267  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 14.0  | 7826  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 15.0  | 8385  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 16.0  | 8944  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 17.0  | 9503  | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 18.0  | 10062 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 19.0  | 10621 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 20.0  | 11180 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 21.0  | 11739 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 22.0  | 12298 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 23.0  | 12857 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 24.0  | 13416 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 25.0  | 13975 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 26.0  | 14534 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 27.0  | 15093 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 28.0  | 15652 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 29.0  | 16211 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 30.0  | 16770 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 31.0  | 17329 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 32.0  | 17888 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 33.0  | 18447 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 34.0  | 19006 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 35.0  | 19565 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 36.0  | 20124 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 37.0  | 20683 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 38.0  | 21242 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 39.0  | 21801 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 40.0  | 22360 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 41.0  | 22919 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 42.0  | 23478 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 43.0  | 24037 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 44.0  | 24596 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 45.0  | 25155 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 46.0  | 25714 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 47.0  | 26273 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 48.0  | 26832 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 49.0  | 27391 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |
| 0.0           | 50.0  | 27950 | nan             | 1.5134 | 0.2001 | 1.4917 | 1.4788    | 8.6992  |

Framework versions

  • Transformers 4.46.3
  • Pytorch 2.5.1+cu118
  • Datasets 3.1.0
  • Tokenizers 0.20.3
Model size

  • 582M parameters (F32 tensors, safetensors format)

Model tree for guan06/mt5-base-finetuned-xsum

  • Base model: google/mt5-base