distilgpt2-werther-finetuned

This is a DistilGPT2 model fine-tuned on "The Sorrows of Young Werther" by Johann Wolfgang von Goethe (specifically, on an English translation of the novel).

The goal of this fine-tuning project was to explore how a smaller language model like DistilGPT2 could capture the unique melancholic tone, vocabulary, and stylistic nuances of Werther's letters.

Model Description

The model began as a pre-trained DistilGPT2, a distilled version of GPT-2, known for its efficiency and good performance in text generation tasks. It was then adapted to the specific literary domain of Goethe's Werther through fine-tuning.

  • Base Model: distilgpt2
  • Model size: 81.9M parameters (F32 tensors, Safetensors format)
  • Fine-tuned on: "The Sorrows of Young Werther" (English translation)
  • Task: Causal Language Modeling (text generation)
  • Output: Generates text in a style reminiscent of the novel, often picking up on key phrases and emotional vocabulary.

How to Use

You can easily use this model with the Hugging Face transformers library:

from transformers import pipeline

# Replace 'your-username' with your actual Hugging Face username
generator = pipeline("text-generation", model="your-username/distilgpt2-werther-finetuned")

# Example 1: continue the novel's opening line
prompt = "How happy I am that I am gone!"
generated_text = generator(
    prompt,
    max_new_tokens=100,
    num_return_sequences=1,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
)
print(generated_text[0]['generated_text'])

# Example 2: a shorter, more conservative sample
prompt = "My soul yearns for"
generated_text = generator(prompt, max_new_tokens=80, num_return_sequences=1,
                           do_sample=True, temperature=0.6)
print(generated_text[0]['generated_text'])
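
If you want more explicit control than the pipeline helper offers, the tokenizer and model can also be loaded directly. A minimal sketch, using the same placeholder repo id as above:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Same placeholder repo id as above; replace 'your-username' accordingly.
tokenizer = AutoTokenizer.from_pretrained("your-username/distilgpt2-werther-finetuned")
model = AutoModelForCausalLM.from_pretrained("your-username/distilgpt2-werther-finetuned")

inputs = tokenizer("My soul yearns for", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=80,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 models have no pad token by default
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))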

Training Details

  • Hardware: local NVIDIA GeForce RTX 4070 Laptop GPU (or similar)
  • Fine-tuning script: finetune_werther.py (available in the repository along with all Python scripts used to fine-tune distilgpt2; a comparable setup is sketched below)
  • Training epochs: 5
  • Dataset size: ~57,987 tokens (a single novel)
  • Block size: 512 tokens
  • Training framework: Hugging Face transformers library with a PyTorch backend
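
The actual script lives in the repository; purely as an illustration, a comparable run with the Hugging Face Trainer might look like the sketch below. The file name werther.txt, the batch size, and anything not listed above are assumptions, not details of the original run.

# Illustrative sketch only -- not the actual finetune_werther.py.
# Assumes the novel's plain text is saved locally as werther.txt (hypothetical name).
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 models have no pad token by default
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

raw = load_dataset("text", data_files={"train": "werther.txt"})

def tokenize(batch):
    return tokenizer(batch["text"])

def group_texts(examples, block_size=512):
    # Concatenate all token ids and split them into fixed 512-token blocks.
    concatenated = sum(examples["input_ids"], [])
    total = (len(concatenated) // block_size) * block_size
    blocks = [concatenated[i:i + block_size] for i in range(0, total, block_size)]
    return {"input_ids": blocks, "attention_mask": [[1] * block_size for _ in blocks]}

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])
lm_dataset = tokenized.map(group_texts, batched=True,
                           remove_columns=tokenized["train"].column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="distilgpt2-werther-finetuned",
        num_train_epochs=5,             # matches the 5 epochs above
        per_device_train_batch_size=4,  # assumed, not documented
    ),
    train_dataset=lm_dataset["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()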

Limitations and Bias

Repetitive Output: Due to the relatively small size of the fine-tuning dataset (a single novel) and the base model's architecture, the generated text can often become repetitive or loop on certain phrases. This is a common characteristic of smaller models fine-tuned on limited domain-specific data.
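
A common generation-time mitigation, not part of the original setup, is to penalize repeats when sampling, for example via the repetition_penalty and no_repeat_ngram_size arguments that transformers forwards to generate(). Reusing the generator pipeline from the usage example:

# Discourage loops at generation time (mitigation sketch; values are illustrative)
output = generator(
    "My soul yearns for",
    max_new_tokens=100,
    do_sample=True,
    temperature=0.7,
    repetition_penalty=1.2,   # down-weight tokens that already appeared
    no_repeat_ngram_size=3,   # never repeat the same 3-gram verbatim
)
print(output[0]["generated_text"])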

Lack of Long-Range Coherence: The model may struggle to maintain a coherent narrative or theme over longer generated passages.

Bias: The model will reflect any biases present in the original "The Sorrows of Young Werther" text. It will also reflect the melancholic, romantic, and somewhat obsessive tone of the original work.

Limited Knowledge: The model knows only what was present in DistilGPT2's pre-training data and what it learned from Werther; it should not be relied on for general world knowledge.

Future Work

  • Fine-tuning on a larger corpus of 18th-century romantic literature (in English) to improve fluency and reduce repetition.
  • Experimenting with different generation parameters (temperature, top_k, top_p) for more varied outputs; a sketch of such a sweep follows this list.
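
As a starting point for that experimentation, a quick sweep over sampling temperatures with the generator pipeline from the usage example might look like this (the specific values are illustrative):

# Compare output variety across sampling temperatures (illustrative values)
for temperature in (0.5, 0.7, 0.9, 1.1):
    out = generator("How happy I am that I am gone!",
                    max_new_tokens=60, do_sample=True,
                    temperature=temperature, top_k=50, top_p=0.95)
    print(f"--- temperature={temperature} ---")
    print(out[0]["generated_text"])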

Acknowledgements

  • Hugging Face Transformers library for providing the tools for fine-tuning.
  • Johann Wolfgang von Goethe for "The Sorrows of Young Werther."
  • The specific translation used for the fine-tuning (Project Gutenberg eBook #2527, metadata below):

Ebook: https://www.gutenberg.org/ebooks/2527

Author: Goethe, Johann Wolfgang von, 1749-1832
Translator: Boylan, R. Dillon (Richard Dillon), 1805?-1888
Uniform Title: Die Leiden des jungen Werther. English
Title: The Sorrows of Young Werther
Note: Wikipedia page about this book: https://en.wikipedia.org/wiki/The_Sorrows_of_Young_Werther
Note: Translation of: Die Leiden des jungen Werther.
Credits: Produced by Michael Potter, Irene Potter, and David Widger
Language: English
LoC Class: PT: Language and Literatures: Germanic, Scandinavian, and Icelandic literatures
Subject: Germany -- Social life and customs -- Fiction
Subject: Unrequited love -- Fiction
Subject: Young men -- Germany -- Fiction
Category: Text
EBook-No.: 2527
Release Date: Feb 1, 2001
Most Recently Updated: Aug 12, 2024
Copyright Status: Public domain in the USA.

