news-summarizer-t5

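This model is a fine-tuned version of google-t5/t5-small on an unknown dataset. It achieves the following results on the evaluation set (values taken from the final epoch, step 2008, of the training results table):

Loss: 0.6177
Model Preparation Time: 0.0049
Rouge1: 19.8849
Rouge2: 17.9939
Rougel: 19.5328
Rougelsum: 19.5918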

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 8
mixed_precision_training: Native AMP
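
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step and the range of training-set sizes consistent with 251 steps per epoch. The card does not state warmup steps, device count, or gradient accumulation, so zero warmup, a single device, and no accumulation are assumptions here; the helper names `linear_lr` and `train_set_size_bounds` are hypothetical and not part of any library.

```python
from __future__ import annotations

# Values taken from the hyperparameter list and the training results.
BASE_LR = 0.0005
TRAIN_BATCH_SIZE = 8
NUM_EPOCHS = 8
STEPS_PER_EPOCH = 251  # step count reported at epoch 1.0

# Assumptions (not stated in the card): zero warmup steps, one device,
# no gradient accumulation, and the last partial batch is kept.
WARMUP_STEPS = 0
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int) -> float:
    """Learning rate after `step` updates under linear warmup then linear decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


def train_set_size_bounds() -> tuple[int, int]:
    """Smallest and largest example counts that yield STEPS_PER_EPOCH batches."""
    smallest = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
    largest = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE
    return smallest, largest


if __name__ == "__main__":
    for step in (0, 1004, 2008):
        print(step, linear_lr(step))
    print(train_set_size_bounds())
```

Under these assumptions the schedule starts at 0.0005, reaches half that value at step 1004 (end of epoch 4), and decays to zero at step 2008, and the training set would hold between 2001 and 2008 examples.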

Training results

Training Loss	Epoch	Step	Validation Loss	Model Preparation Time	Rouge1	Rouge2	Rougel	Rougelsum
0.8689	1.0	251	0.6581	0.0049	18.8745	16.2314	18.1991	18.3287
0.6629	2.0	502	0.6385	0.0049	19.3705	17.1277	18.8685	18.9594
0.6114	3.0	753	0.6294	0.0049	19.3951	17.2113	18.9315	18.9848
0.571	4.0	1004	0.6197	0.0049	19.8684	17.8234	19.4646	19.5401
0.5451	5.0	1255	0.6193	0.0049	19.8981	17.9851	19.5083	19.5177
0.5194	6.0	1506	0.6203	0.0049	19.8675	17.9521	19.5434	19.6046
0.4894	7.0	1757	0.6166	0.0049	19.8622	17.9616	19.4791	19.5669
0.4872	8.0	2008	0.6177	0.0049	19.8849	17.9939	19.5328	19.5918
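
To make the trend in these rows easier to read, here is a minimal sketch that picks the best epoch per metric. The numbers are copied from the table above; the `best_epoch` helper and the short column names are hypothetical, introduced only for this illustration.

```python
# (epoch, validation loss, Rouge1, Rouge2, Rougel, Rougelsum), copied from the table.
COLUMNS = ("epoch", "val_loss", "rouge1", "rouge2", "rougeL", "rougeLsum")
ROWS = [
    (1, 0.6581, 18.8745, 16.2314, 18.1991, 18.3287),
    (2, 0.6385, 19.3705, 17.1277, 18.8685, 18.9594),
    (3, 0.6294, 19.3951, 17.2113, 18.9315, 18.9848),
    (4, 0.6197, 19.8684, 17.8234, 19.4646, 19.5401),
    (5, 0.6193, 19.8981, 17.9851, 19.5083, 19.5177),
    (6, 0.6203, 19.8675, 17.9521, 19.5434, 19.6046),
    (7, 0.6166, 19.8622, 17.9616, 19.4791, 19.5669),
    (8, 0.6177, 19.8849, 17.9939, 19.5328, 19.5918),
]


def best_epoch(metric: str, higher_is_better: bool = True) -> int:
    """Return the epoch whose row has the best value for `metric`."""
    idx = COLUMNS.index(metric)
    pick = max if higher_is_better else min
    return pick(ROWS, key=lambda row: row[idx])[0]


if __name__ == "__main__":
    print("lowest validation loss: epoch", best_epoch("val_loss", higher_is_better=False))
    for name in COLUMNS[2:]:
        print(f"best {name}: epoch", best_epoch(name))
```

On these rows, validation loss is lowest at epoch 7, Rouge1 peaks at epoch 5, Rouge2 at epoch 8, and Rougel and Rougelsum at epoch 6. All metrics move by less than about 0.1 after epoch 4, which suggests training had largely plateaued by then, although the table alone cannot say whether the remaining differences are meaningful.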