# Pegasus SAMSum Fine-tuned Model
This is a fine-tuned version of the google/pegasus-samsum model, further trained on the SAMSum dataset for improved chat conversation summarization.
## Model Details
- Base Model: google/pegasus-samsum
- Fine-tuning Dataset: SAMSum (a human-annotated dialogue summarization corpus from Samsung R&D Institute Poland)
- Task: Abstractive Text Summarization
- Language: English
- License: MIT
## Training Details
This model was fine-tuned using:
- A custom training pipeline built with PyTorch and Hugging Face Transformers (see the sketch below)
- Optimized for chat conversation summarization
- Enhanced performance on dialogue-based content
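The actual training script is not published with this card, so the following is a minimal, hypothetical sketch of such a pipeline using the Hugging Face `Seq2SeqTrainer`. All hyperparameters are illustrative, and loading the `samsum` dataset from the Hub requires the `py7zr` package.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

# Start from the base checkpoint and fine-tune further on SAMSum
base_model = "google/pegasus-samsum"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSeq2SeqLM.from_pretrained(base_model)
dataset = load_dataset("samsum")  # dialogue/summary pairs; needs py7zr

def preprocess(batch):
    # Tokenize dialogues as inputs and reference summaries as labels
    inputs = tokenizer(batch["dialogue"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(
    preprocess, batched=True, remove_columns=dataset["train"].column_names
)

# Illustrative hyperparameters, not the ones actually used for this model
args = Seq2SeqTrainingArguments(
    output_dir="pegasus-samsum-finetuned",
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,
    weight_decay=0.01,
    logging_steps=50,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```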
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned model and tokenizer from the Hub
model_name = "Ananthakr1shnan/pegasus-samsum-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Example dialogue to summarize
dialogue = """
John: Hey Sarah, how was your day at work?
Sarah: Pretty good! Had a big presentation today.
John: How did it go?
Sarah: Really well actually. The client loved our proposal.
"""

# Tokenize the dialogue and generate a summary with beam search
inputs = tokenizer(dialogue, max_length=512, truncation=True, return_tensors="pt")
summary_ids = model.generate(
    **inputs,  # passes input_ids and attention_mask together
    max_length=50,
    min_length=10,
    length_penalty=2.0,
    num_beams=4,
)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```
## Applications
- Chat Summarization: Summarize conversation threads
- Meeting Notes: Extract key points from transcripts
- Customer Support: Summarize support conversations
- Email Threads: Condense long email chains
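For quick experiments with any of these applications, the high-level `pipeline` API is a convenient alternative to manual tokenization. A minimal sketch, with illustrative generation parameters:

```python
from transformers import pipeline

# Summarization pipeline backed by this fine-tuned checkpoint
summarizer = pipeline(
    "summarization", model="Ananthakr1shnan/pegasus-samsum-finetuned"
)

chat = "Amy: Lunch at noon?\nBen: Sure, the usual place?\nAmy: Yes, see you there."
print(summarizer(chat, max_length=50, min_length=10)[0]["summary_text"])
```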
## Performance
This fine-tuned model shows improved performance over the base model on:
- Dialogue understanding
- Key information extraction
- Coherent summary generation
- Context preservation
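No quantitative scores are reported on this card. As a sketch, the claims above can be checked by scoring generated summaries against SAMSum references with ROUGE via the `evaluate` library (assumes the `samsum` dataset on the Hub; the test slice size is arbitrary):

```python
import evaluate
from datasets import load_dataset
from transformers import pipeline

summarizer = pipeline(
    "summarization", model="Ananthakr1shnan/pegasus-samsum-finetuned"
)
rouge = evaluate.load("rouge")

# Score a small test slice; enlarge the slice for a more reliable estimate
test = load_dataset("samsum", split="test[:50]")
predictions = [
    summarizer(d, max_length=50, min_length=10)[0]["summary_text"]
    for d in test["dialogue"]
]
print(rouge.compute(predictions=predictions, references=test["summary"]))
```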
## Author
Ananthakrishnan K
- Email: [email protected]
- Hugging Face: Ananthakr1shnan