longt5_xl_gov_report_bp_10_continue

This model is a fine-tuned version of the local checkpoint /home/co-ou1/rds/hpc-work/transformers/examples/pytorch/summarization/longt5_xl_gov_report_bp_10/checkpoint-477, trained further on the learn3r/gov_report_memsum_oracle dataset. It achieves the following results on the evaluation set (the values match the epoch 1.0, step 68 row of the training results):

Loss: 1.4878
Rouge1: 71.9439
Rouge2: 43.7031
Rougel: 41.8301
Rougelsum: 69.1853
Gen Len: 833.0319

Model description

More information needed

Intended uses & limitations

More information needed
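
Pending fuller documentation, the training script path and the ROUGE metrics suggest the model is meant for summarizing long reports. The following is a minimal sketch of how such a checkpoint might be loaded, assuming the repository ships tokenizer files and loads through the generic seq2seq auto classes. The function name summarize and both token limits are hypothetical placeholders chosen for illustration, not values taken from this card.

```python
MODEL_ID = "learn3r/longt5_xl_gov_report_bp_10_continue"


def summarize(text, max_input_tokens=8192, max_new_tokens=1024):
    """Return a generated summary for `text`.

    Both limits are illustrative assumptions; the card does not state the
    input or output lengths used in training.
    """
    # Imported lazily so that defining this function does not require the library.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate the input to the assumed maximum length and return PyTorch tensors.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

The default of 1024 new tokens is only a guess sized to leave headroom above the reported average generation length of about 833 tokens. An XL-sized model will likely need a GPU with substantial memory to run at a practical speed.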

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 32
total_train_batch_size: 256
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant
num_epochs: 4.0
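
The batch-size figures above can be cross-checked with a little arithmetic. This sketch assumes the usual relation in which the total batch size is the per-device batch size times the gradient accumulation steps times the number of devices; the device count is inferred here, not stated in the card.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 8                # per-device batch size
gradient_accumulation_steps = 32
total_train_batch_size = 256

# Assumption: total = per-device batch size x accumulation steps x device count.
num_devices = total_train_batch_size // (
    train_batch_size * gradient_accumulation_steps
)
assert (
    train_batch_size * gradient_accumulation_steps * num_devices
    == total_train_batch_size
)

# The training results log one evaluation every 68 optimizer steps (one epoch).
steps_per_epoch = 68
examples_per_epoch = steps_per_epoch * total_train_batch_size

print(num_devices)         # 1
print(examples_per_epoch)  # 17408
```

Under that assumption the run used a single device, and each epoch covered about 17,408 training examples.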

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
0.6226	1.0	68	1.4878	71.9439	43.7031	41.8301	69.1853	833.0319
0.4983	1.99	136	1.5908	70.6191	43.2627	42.581	68.0871	627.5031
0.4175	2.99	204	1.6407	71.6704	43.1655	41.9746	68.992	737.4352
0.3958	3.99	272	1.8739	70.7685	42.5122	41.7454	68.0785	671.4938
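
To make the trend in the table easier to see, the rows can be loaded into plain Python records and compared. This is only an illustrative sketch; the variable names are hypothetical and the numbers are copied from the table above.

```python
HEADER = [
    "Training Loss", "Epoch", "Step", "Validation Loss",
    "Rouge1", "Rouge2", "Rougel", "Rougelsum", "Gen Len",
]
ROWS = [
    (0.6226, 1.0, 68, 1.4878, 71.9439, 43.7031, 41.8301, 69.1853, 833.0319),
    (0.4983, 1.99, 136, 1.5908, 70.6191, 43.2627, 42.581, 68.0871, 627.5031),
    (0.4175, 2.99, 204, 1.6407, 71.6704, 43.1655, 41.9746, 68.992, 737.4352),
    (0.3958, 3.99, 272, 1.8739, 70.7685, 42.5122, 41.7454, 68.0785, 671.4938),
]
records = [dict(zip(HEADER, row)) for row in ROWS]

# Pick the evaluation with the lowest validation loss and the one with the best Rougel.
best_by_loss = min(records, key=lambda r: r["Validation Loss"])
best_by_rougel = max(records, key=lambda r: r["Rougel"])

# Check whether validation loss rises at every evaluation.
val_losses = [r["Validation Loss"] for r in records]
loss_rises_each_epoch = val_losses == sorted(val_losses)

print(best_by_loss["Step"])    # 68
print(best_by_rougel["Step"])  # 136
print(loss_rises_each_epoch)   # True
```

The step 68 row has the lowest validation loss and the highest Rouge1, which agrees with the headline results reported at the top of the card. Training loss falls while validation loss rises at every later evaluation, a pattern consistent with overfitting after the first epoch.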

Framework versions

Transformers 4.34.0.dev0
Pytorch 2.0.1+cu117
Datasets 2.14.5
Tokenizers 0.13.3
