---
language:
- ru
- en
license: apache-2.0
base_model: gpt2
tags:
- not-for-all-audiences
- art
- humour
- jokes
- generated_from_keras_callback
model-index:
- name: zeio/fool
  results: []
datasets:
- zeio/baneks
metrics:
- loss
widget:
- text: 'Купил мужик шляпу'
  example_title: hat
- text: 'Пришла бабка к врачу'
  example_title: doctor
- text: 'Нашел мужик подкову'
  example_title: horseshoe
---
<p align="center">
  <img src="https://i.ibb.co/S531946/fool-logo.png"/>
</p>
# fool

This model is a fine-tuned version of [gpt2][gpt2] on the [baneks][baneks] dataset. It was trained for 1 epoch and reached a training loss of `1.9752`. The model has not been formally evaluated.
## Model description

The model is a fine-tuned variant of the base [gpt2][gpt2] architecture with a causal language modelling head.
## Intended uses & limitations

The model is intended for studying the ability of natural language models to generate jokes.
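A minimal generation sketch using the TensorFlow classes implied by the framework versions below; the sampling parameters are illustrative assumptions, not values used by the authors:

```python
from transformers import AutoTokenizer, TFAutoModelForCausalLM

# Load the fine-tuned checkpoint (repo name taken from the model-index above)
tokenizer = AutoTokenizer.from_pretrained('zeio/fool')
model = TFAutoModelForCausalLM.from_pretrained('zeio/fool')

# Prompt with the opening words of a joke, as in the widget examples
inputs = tokenizer('Купил мужик шляпу', return_tensors='tf')

# Sampling settings here are illustrative only
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```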
## Training and evaluation data

The model was trained on jokes (anecdotes) pulled from several VK communities (see the [baneks][baneks] dataset for details).
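The dataset can be loaded with the `datasets` library; the split name below is an assumption, since this card does not document the dataset layout:

```python
from datasets import load_dataset

# Pull the anecdotes corpus referenced above (split name is an assumption)
baneks = load_dataset('zeio/baneks', split='train')
print(baneks)
```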
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- optimizer:
```json
{
  "name": "AdamWeightDecay",
  "learning_rate": {
    "module": "transformers.optimization_tf",
    "class_name": "WarmUp",
    "config": {
      "initial_learning_rate": 5e-05,
      "decay_schedule_fn": {
        "module": "keras.optimizers.schedules",
        "class_name": "PolynomialDecay",
        "config": {
          "initial_learning_rate": 5e-05,
          "decay_steps": 28462,
          "end_learning_rate": 0.0,
          "power": 1.0,
          "cycle": false,
          "name": null
        },
        "registered_name": null
      },
      "warmup_steps": 1000,
      "power": 1.0,
      "name": null
    },
    "registered_name": "WarmUp"
  },
  "decay": 0.0,
  "beta_1": 0.9,
  "beta_2": 0.999,
  "epsilon": 1e-08,
  "amsgrad": false,
  "weight_decay_rate": 0.01
}
```
- training_precision: `mixed_float16`
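For reference, `transformers.create_optimizer` builds an equivalent `AdamWeightDecay` with the same warmup-then-linear-decay schedule; this is a sketch of how such a configuration is typically produced, not a record of the authors' training script:

```python
import tensorflow as tf
from transformers import create_optimizer

# Match the mixed-precision policy reported above
tf.keras.mixed_precision.set_global_policy('mixed_float16')

# Reproduces the serialized config: 5e-5 peak LR, 1000 warmup steps,
# linear (power=1.0) decay to 0 over 28462 total steps, 0.01 weight decay
optimizer, lr_schedule = create_optimizer(
    init_lr=5e-05,
    num_train_steps=28462,
    num_warmup_steps=1000,
    weight_decay_rate=0.01,
)
```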
### Training results

| Train Loss | Epoch |
|:----------:|:-----:|
| 1.9752     | 0     |
### Framework versions

- Transformers 4.35.0.dev0
- TensorFlow 2.14.0
- Datasets 2.12.0
- Tokenizers 0.14.1
[baneks]: https://huggingface.co/datasets/zeio/baneks | |
[gpt2]: https://huggingface.co/gpt2 | |