---
language:
- ru
- en
license: apache-2.0
base_model: igorktech/rugpt3-joker-150k
tags:
- not-for-all-audiences
- art
- humour
- jokes
- generated_from_trainer
model-index:
- name: zeio/wit
  results: []
datasets:
- zeio/baneks
metrics:
- loss
widget:
- text: 'Купил мужик шляпу'
  example_title: hat
- text: 'Пришла бабка к врачу'
  example_title: doctor
- text: 'Нашел мужик подкову'
  example_title: horseshoe
---
<p align="center">
  <img src="https://i.ibb.co/zP7j7ng/wit-logo.png"/>
</p>

# wit
This model is a fine-tuned version of [igorktech/rugpt3-joker-150k][base] trained on the [baneks][dataset] dataset for 10 epochs. It reached a final training loss of `2.0391`. Model evaluation has not been performed.
## Model description

The model is a fine-tuned variant of the [igorktech/rugpt3-joker-150k][base] architecture with a causal language modeling head.
## Intended uses & limitations

The model is intended for studying the ability of natural language models to generate jokes.
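A minimal generation sketch, assuming the model id `zeio/wit` from this card's metadata and an installed `transformers` library with a PyTorch backend:

```python
from transformers import pipeline

# Load the model from the Hub under the id declared in this card's metadata
generator = pipeline("text-generation", model="zeio/wit")

# Prompt with one of the widget examples ("A man bought a hat")
result = generator("Купил мужик шляпу", max_new_tokens=64, do_sample=True)
print(result[0]["generated_text"])
```

Sampling (`do_sample=True`) is suggested here because greedy decoding tends to produce repetitive continuations for joke-style prompts.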
## Training and evaluation data

The model was trained on a list of anecdotes collected from several VK communities (see the [baneks][dataset] dataset for details).
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 10
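The total train batch size above is not an independent setting: it is the per-device batch size multiplied by the gradient accumulation steps. A small sketch, with the hyperparameters collected in a plain dict purely for illustration:

```python
# Hyperparameters as listed in this card, gathered in a dict for illustration
hparams = {
    "learning_rate": 5e-4,
    "train_batch_size": 8,
    "gradient_accumulation_steps": 8,
    "num_epochs": 10,
}

# Effective (total) train batch size per optimizer step:
# gradients from 8 micro-batches of 8 samples are accumulated before each update
total_train_batch_size = (
    hparams["train_batch_size"] * hparams["gradient_accumulation_steps"]
)
print(total_train_batch_size)  # 64
```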
### Training results

| Train Loss | Epoch |
|:----------:|:-----:|
| 2.0391     | 10    |
### Framework versions

- Transformers 4.34.0
- Pytorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.14.1

[base]: https://huggingface.co/igorktech/rugpt3-joker-150k
[dataset]: https://huggingface.co/datasets/zeio/baneks