distilbart-summarization-top

This model is a fine-tuned version of sshleifer/distilbart-cnn-12-6 on an unknown dataset. No final evaluation metrics are listed here; the validation loss logged during training appears in the results table at the end.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 8
optimizer: Adafactor (OptimizerNames.ADAFACTOR), no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 2.0
mixed_precision_training: Native AMP
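
As a rough illustration, the hyperparameters above could be expressed as keyword arguments for `Seq2SeqTrainingArguments` from the transformers library. This is a sketch, not the original training script. The output directory is a hypothetical name, `fp16` is an assumption about what "Native AMP" meant here (it could also have been bf16), and treating `train_batch_size` as a per-device value assumes a single device, which is consistent with 4 x 2 = 8.

```python
# Hyperparameters from the list above, as keyword arguments.
training_kwargs = {
    "output_dir": "distilbart-summarization-top",  # hypothetical path
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 4,   # assumes a single device
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "adafactor",
    "lr_scheduler_type": "linear",
    "num_train_epochs": 2.0,
    "fp16": True,                       # assumption: "Native AMP" as fp16
}

# Effective batch size per optimizer step: per-device batch x accumulation steps.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
print(effective_batch_size)  # matches total_train_batch_size: 8


def build_training_args():
    """Build the arguments object; requires transformers and, for fp16, a GPU."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**training_kwargs)
```

The dictionary is kept separate from the `Seq2SeqTrainingArguments` call so the values can be inspected or logged without importing transformers or having a GPU available.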

Training Loss    Epoch     Step     Validation Loss
1.1609           0.3765    2000     1.0716
1.1550           0.7529    4000     1.0546
1.0795           1.1293    6000     1.0495
1.0429           1.5058    8000     1.0466
1.0584           1.8823    10000    1.0451
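
The table also allows a back-of-the-envelope estimate of the training set size, which the card does not state. The sketch below divides each logged step by its epoch value to estimate optimizer steps per epoch, then multiplies by the total train batch size of 8. The result is an approximation derived only from the logged numbers, not a documented figure.

```python
# (training loss, epoch, step, validation loss) rows copied from the table above.
rows = [
    (1.1609, 0.3765, 2000, 1.0716),
    (1.1550, 0.7529, 4000, 1.0546),
    (1.0795, 1.1293, 6000, 1.0495),
    (1.0429, 1.5058, 8000, 1.0466),
    (1.0584, 1.8823, 10000, 1.0451),
]
TOTAL_TRAIN_BATCH_SIZE = 8  # from the hyperparameter list

# Each row yields step / epoch, an estimate of optimizer steps per epoch.
steps_per_epoch = sum(step / epoch for _, epoch, step, _ in rows) / len(rows)

# Every optimizer step consumes one effective batch of examples.
approx_train_examples = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

# Drop in validation loss between consecutive evaluations.
val_drops = [round(prev[3] - cur[3], 4) for prev, cur in zip(rows, rows[1:])]

print(f"steps per epoch ~ {steps_per_epoch:.0f}")
print(f"training examples ~ {approx_train_examples:.0f}")
print("validation-loss drops:", val_drops)
```

Under these assumptions the estimate comes out at roughly 5,300 optimizer steps per epoch, or about 42,500 training examples. The validation-loss drops shrink at every evaluation, which suggests the model was close to a plateau by the end of the second epoch.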