raghavbali committed
Commit 25093e1 • 1 Parent(s): b248c85
update model card

README.md CHANGED
@@ -12,21 +12,22 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-# 
+# GPT2 Fine Tuned Headline Generator
 
-This model is a fine-tuned version of [openai-community/gpt2-medium](https://huggingface.co/openai-community/gpt2-medium) on an unknown dataset.
+- This model is trained on the [harvard/abcnews-dataset](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/SYBGZL) to generate news headlines.
+- This model is a fine-tuned version of [openai-community/gpt2-medium](https://huggingface.co/openai-community/gpt2-medium).
 
 ## Model description
 
-
+The model is fine-tuned for 2 epochs on 4k training samples from the abcnews dataset. This enables the model to generate news-headline-like text given a simple prompt.
 
 ## Intended uses & limitations
 
-
+This model is intended for learning purposes only. It easily hallucinates people's names, locations, and other artifacts and incidents.
 
 ## Training and evaluation data
 
-
+The model uses 2k test samples for evaluation.
 
 ## Training procedure
 
@@ -43,7 +44,8 @@ The following hyperparameters were used during training:
 - num_epochs: 2
 
 ### Training results
-
+The final output after 2 epochs is as follows:
+TrainOutput(global_step=130, training_loss=5.044873604407678, metrics={'train_runtime': 140.587, 'train_samples_per_second': 59.166, 'train_steps_per_second': 0.925, 'total_flos': 248723096358912.0, 'train_loss': 5.044873604407678, 'epoch': 2.0})
 
 
 ### Framework versions
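The TrainOutput metrics reported in the model card can be cross-checked against each other. A minimal sketch of that arithmetic (using only the figures quoted above; the 2-epoch and ~4k-sample counts come from the card's prose):

```python
# Sanity-check the TrainOutput metrics from the model card.
global_step = 130
train_runtime = 140.587        # seconds
samples_per_second = 59.166
epochs = 2.0

# train_steps_per_second should equal global_step / train_runtime.
derived_steps_per_second = round(global_step / train_runtime, 3)
print(derived_steps_per_second)  # 0.925, matching the reported value

# Total samples processed, divided by epochs, gives samples per epoch —
# consistent with the ~4k training samples stated in the description.
samples_per_epoch = round(samples_per_second * train_runtime / epochs)
print(samples_per_epoch)  # 4159
```

The reported throughput and step counts are internally consistent, which is a quick way to catch copy-paste errors when updating auto-generated cards.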