---
language:
- en
license: cc-by-sa-4.0
tags:
- generated_from_trainer
- text-generation-inference
datasets:
- Non-Residual-Prompting/C2Gen
pipeline_tag: text-generation
base_model: gpt2
model-index:
- name: gpt2-commongen-finetuned
  results: []
---

# gpt2-context_generator

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2/) on the [Non-Residual-Prompting/C2Gen](https://huggingface.co/datasets/Non-Residual-Prompting/C2Gen) dataset.

## Model description

GPT-2 fine-tuned with a causal language modelling objective on C2Gen: given a free-text context and a set of target words, the model is trained to generate commonsensical text that includes the target words while adhering to the context.

## Intended uses & limitations

- Check `config.json` for the prompt template and sampling strategy; a usage sketch follows below.
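
A minimal usage sketch. The repo id and prompt format below are placeholders for illustration, not taken from the card; the actual template and sampling settings live in this model's `config.json`:

```python
# Hedged usage sketch: the repo id and prompt format are assumptions.
# Consult config.json for the real prompt template and sampling strategy.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2-commongen-finetuned")  # placeholder id

context = "The sun was setting over the beach."   # free-text context
target_words = "dog, ball, run"                   # words the output must include
prompt = f"{context} Words: {target_words}."      # assumed format, for illustration

outputs = generator(prompt, max_new_tokens=50, do_sample=True, top_p=0.9)
print(outputs[0]["generated_text"])
```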

### Dataset Summary

CommonGen ([Lin et al., 2020](https://arxiv.org/abs/1911.03705)) is a dataset for the constrained text-generation task of word inclusion, but the task provides no way to include context. To complement CommonGen, we therefore provide an extended test set, C2Gen ([Carlsson et al., 2022](https://aclanthology.org/2022.acl-long.471)), in which an additional context is supplied for each set of target words. The task is thus reformulated: generate commonsensical text that includes the given words while also adhering to the given context.
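
To inspect the data, the set can be loaded directly with the `datasets` library. Split and column names are not stated in this card, so the sketch below discovers them at runtime rather than assuming them:

```python
# Sketch: load and peek at C2Gen without assuming split or column names.
from datasets import load_dataset

ds = load_dataset("Non-Residual-Prompting/C2Gen")
print(ds)                # shows the available splits and their columns
split = next(iter(ds))   # first split name, whatever it is called
print(ds[split][0])      # one example: a context plus its target words
```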

## Training procedure
- Causal Language Modelling

### Training hyperparameters

The following hyperparameters were used during training (a mapping to `TrainingArguments` is sketched after this list):
- learning_rate: 9e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 8
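
As a rough reproduction aid, these settings map onto `transformers.TrainingArguments` as sketched below. The output path and dataset preprocessing are assumptions, and the Adam betas/epsilon listed above match the `Trainer` defaults:

```python
# Sketch mapping the reported hyperparameters to TrainingArguments.
# output_dir is a placeholder; dataset tokenization is omitted.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token

args = TrainingArguments(
    output_dir="gpt2-commongen-finetuned",  # placeholder path
    learning_rate=9e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.2,
    num_train_epochs=8,
)

collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM objective

# trainer = Trainer(model=model, args=args, train_dataset=tokenized_train,
#                   eval_dataset=tokenized_eval, data_collator=collator)
# trainer.train()
```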

### Framework versions

- Transformers 4.27.3
- Pytorch 1.13.1+cu116
- Datasets 2.13.1
- Tokenizers 0.13.2