---
license: mit
tags:
- generated_from_trainer
datasets:
- it5/datasets
metrics:
- rouge
model-index:
- name: it5-efficient-small-el32-st_g2r-0.0003
  results:
  - task:
      name: Summarization
      type: summarization
    dataset:
      name: it5/datasets st_g2r
      type: it5/datasets
      args: st_g2r
    metrics:
    - name: Rouge1
      type: rouge
      value: 29.8455
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# it5-efficient-small-el32-st_g2r-0.0003

This model is a fine-tuned version of [stefan-it/it5-efficient-small-el32](https://huggingface.co/stefan-it/it5-efficient-small-el32) on the it5/datasets st_g2r dataset.
It achieves the following results on the evaluation set:
- Loss: 2.6892
- Rouge1: 29.8455
- Rouge2: 11.735
- Rougel: 26.6048
- Rougelsum: 26.8553
- Gen Len: 14.6131

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10.0
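
As a rough illustration of what `lr_scheduler_type: linear` implies for the learning rate listed above, here is a minimal sketch in plain Python. It assumes no warmup steps (the card lists none) and about 67,000 optimizer steps in total. That total is an estimate inferred from the step and epoch columns of the training results table, not a value stated in this card. The function name `linear_lr` and the constant `TOTAL_STEPS` are hypothetical.

```python
def linear_lr(step: int, total_steps: int,
              base_lr: float = 3e-4, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * (step / warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Assumption: roughly 67,000 steps for 10 epochs (not stated in the card).
TOTAL_STEPS = 67_000

if __name__ == "__main__":
    for step in (0, 5_000, 33_500, 65_000, TOTAL_STEPS):
        print(f"step {step:>6}: lr = {linear_lr(step, TOTAL_STEPS):.6f}")
```

Under these assumptions the learning rate starts at 0.0003, reaches half that value midway through training, and is zero at the final step.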

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 3.2179        | 0.74  | 5000  | 2.7813          | 25.8006 | 9.3551  | 23.386  | 23.5287   | 13.5337 |
| 2.9248        | 1.49  | 10000 | 2.6914          | 27.0409 | 10.0228 | 24.4581 | 24.6197   | 13.243  |
| 2.6813        | 2.23  | 15000 | 2.6462          | 27.5333 | 10.3641 | 24.8696 | 25.0564   | 14.3052 |
| 2.691         | 2.98  | 20000 | 2.6205          | 28.3681 | 10.8961 | 25.5144 | 25.722    | 14.5279 |
| 2.5127        | 3.72  | 25000 | 2.6043          | 28.5979 | 11.0477 | 25.759  | 25.9605   | 14.0721 |
| 2.3331        | 4.47  | 30000 | 2.6283          | 28.9106 | 11.3727 | 25.9338 | 26.1387   | 14.4519 |
| 2.2034        | 5.21  | 35000 | 2.6400          | 29.099  | 11.2376 | 26.1221 | 26.3568   | 13.8715 |
| 2.2137        | 5.96  | 40000 | 2.6340          | 29.2641 | 11.3565 | 26.2012 | 26.4214   | 14.5981 |
| 2.1104        | 6.7   | 45000 | 2.6362          | 29.6204 | 11.6807 | 26.5976 | 26.8261   | 13.888  |
| 2.003         | 7.45  | 50000 | 2.6541          | 29.5679 | 11.6334 | 26.5095 | 26.7418   | 14.2246 |
| 1.8955        | 8.19  | 55000 | 2.6940          | 29.6748 | 11.5897 | 26.4862 | 26.7581   | 14.3902 |
| 1.912         | 8.94  | 60000 | 2.6883          | 29.7285 | 11.6448 | 26.5368 | 26.7806   | 14.3574 |
| 1.8581        | 9.68  | 65000 | 2.6874          | 29.7373 | 11.6532 | 26.4799 | 26.738    | 14.3821 |
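
To make the trend in the table easier to read, the sketch below picks the logged checkpoint with the lowest validation loss and the one with the highest Rouge1. The step, validation loss, and Rouge1 values are copied from the table. The names `ROWS`, `best_by_loss`, and `best_by_rouge1` are illustrative only and do not come from the training code.

```python
# (step, validation_loss, rouge1), copied from the training results table.
ROWS = [
    (5000, 2.7813, 25.8006),
    (10000, 2.6914, 27.0409),
    (15000, 2.6462, 27.5333),
    (20000, 2.6205, 28.3681),
    (25000, 2.6043, 28.5979),
    (30000, 2.6283, 28.9106),
    (35000, 2.6400, 29.099),
    (40000, 2.6340, 29.2641),
    (45000, 2.6362, 29.6204),
    (50000, 2.6541, 29.5679),
    (55000, 2.6940, 29.6748),
    (60000, 2.6883, 29.7285),
    (65000, 2.6874, 29.7373),
]

# Lower validation loss is better; higher Rouge1 is better.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_rouge1 = max(ROWS, key=lambda row: row[2])

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("highest Rouge1:", best_by_rouge1)
```

Validation loss is lowest at step 25000, while Rouge1 keeps rising through the last logged step, so the two criteria would select different checkpoints.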

### Framework versions

- Transformers 4.15.0
- Pytorch 1.10.0+cu102
- Datasets 1.17.0
- Tokenizers 0.10.3