ingu627/finetuned-flan-t5-dialogsum


ingu627 committed on Feb 11 · Commit d922602 · verified · 1 Parent(s): c223fff

Update README.md

Files changed (1)

README.md +99 -3


@@ -1,3 +1,99 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+datasets:
+- knkarthick/dialogsum
+language:
+- en
+metrics:
+- rouge
+base_model:
+- google/flan-t5-small
+tags:
+- t5
+- flan
+- fine-tuned
+- instruction
+---
+# FLAN-T5-small Dialogue Summarization
+## Model Description
+A **FLAN-T5-small** model fine-tuned on the DialogSum dataset to generate concise summaries of conversational dialogues. Measured ROUGE scores are listed under Evaluation Results.
+## Training Data
+- **Dataset**: DialogSum (1,837 annotated dialogues)
+- **Preprocessing**: Each dialogue-summary pair from the original dataset was converted into instruction format using the following prompt template:
+prompt_template = """
+Here is a dialogue:
+{dialogue}
+Write a short summary.
+{summary}
+"""
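The conversion to instruction format described above could look roughly like the sketch below. It assumes the `{summary}` slot is used as the training target rather than as part of the model input, which the card does not state. The names `PROMPT_TEMPLATE` and `build_example` and the sample dialogue are hypothetical.

```python
# Hypothetical helper names; only the template wording comes from the model card.
PROMPT_TEMPLATE = "Here is a dialogue:\n{dialogue}\nWrite a short summary.\n"


def build_example(dialogue: str, summary: str) -> dict:
    """Turn one dialogue-summary pair into an (input, target) training record."""
    return {
        # Assumption: the instruction text is the encoder input...
        "input_text": PROMPT_TEMPLATE.format(dialogue=dialogue.strip()),
        # ...and the reference summary is the decoder target.
        "target_text": summary.strip(),
    }


# Made-up sample pair, for illustration only.
record = build_example(
    "#Person1#: Is the meeting still at three?\n#Person2#: Yes, in room B.",
    "Person2 confirms the meeting is at three in room B.",
)
print(record["input_text"])
print(record["target_text"])
```

Keeping the template in a single constant makes it easier to reuse the same wording at inference time.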
+## Training Setup
+| Parameter | Value |
+|-----------|-------|
+| Base Model | google/flan-t5-small |
+| Epochs | 5 |
+| Batch Size | 16 (per device) |
+| Learning Rate | 3e-4 |
+| Optimizer | Adafactor |
+| Mixed Precision | fp16 |
+| Gradient Accumulation | 4 steps |
+| Max Length | 512 tokens |
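A quick arithmetic sketch of what the table implies for the effective batch size and the number of optimizer steps. It assumes a single GPU and that all 1,837 dialogues were used for training. Neither is confirmed by the card, so the step counts are approximate.

```python
import math

# Values taken from the Training Setup table.
per_device_batch_size = 16
gradient_accumulation_steps = 4
epochs = 5

# Assumptions, not stated in the card: one GPU, all 1,837 dialogues used for training.
num_devices = 1
num_train_examples = 1837

# Gradients are accumulated over several mini-batches before each optimizer update.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices
steps_per_epoch = math.ceil(num_train_examples / effective_batch_size)
total_steps = steps_per_epoch * epochs

print(f"effective batch size: {effective_batch_size}")
print(f"approx. optimizer steps per epoch: {steps_per_epoch}")
print(f"approx. total optimizer steps: {total_steps}")
```

A training framework may round partial accumulation windows differently, so the exact step count can differ by a step or so per epoch.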
+## Evaluation Results
+| Metric | Value |
+|--------|-------|
+| ROUGE-1 | 0.174 |
+| ROUGE-2 | 0.045 |
+| ROUGE-L | 0.135 |
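To show what these metrics measure, here is a simplified pure-Python sketch of ROUGE-1, ROUGE-2, and ROUGE-L as F1 scores. It uses lowercase whitespace tokenization with no stemming, so it will not reproduce the numbers above exactly. The card does not say which ROUGE implementation or variant (precision, recall, or F1) produced the reported values.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision (overlap/cand) and recall (overlap/ref)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """Simplified ROUGE-N F1 based on clipped n-gram overlap."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def rouge_l(candidate, reference):
    """Simplified ROUGE-L F1 based on the longest common subsequence (LCS)."""
    c, r = candidate.lower().split(), reference.lower().split()
    table = [[0] * (len(r) + 1) for _ in range(len(c) + 1)]
    for i, c_tok in enumerate(c, 1):
        for j, r_tok in enumerate(r, 1):
            if c_tok == r_tok:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return f1(table[len(c)][len(r)], len(c), len(r))


candidate = "the cat is on the mat"
reference = "the cat sat on the mat"
print(round(rouge_n(candidate, reference, 1), 3))
print(round(rouge_n(candidate, reference, 2), 3))
print(round(rouge_l(candidate, reference), 3))
```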
+## Basic Inference
+```python
+from transformers import pipeline
+
+# Replace the placeholder with the model's Hub repo id,
+# e.g. "ingu627/finetuned-flan-t5-dialogsum".
+summarizer = pipeline(
+    "text2text-generation",
+    model="your_hf_username/your_model_name",
+)
+
+dialogue_example = """
+A: The router keeps disconnecting every hour.
+B: Have you tried firmware update?
+A: Not yet, how do I do that?
+B: Download latest version from our support site.
+"""
+
+# The pipeline returns a list with one dict per input, so index [0] first.
+summary = summarizer(
+    f"Summarize this dialogue:\n{dialogue_example}\nSummary:",
+    max_length=150,
+    num_beams=3,
+)[0]["generated_text"]
+print(summary)
+```
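A `text2text-generation` pipeline returns a list of dicts, one per input, each with a `generated_text` key. The sketch below wraps that access in a small helper. The names `summarize` and `fake_generate` are hypothetical, and the stub stands in for a real pipeline so the example runs without downloading a model.

```python
def summarize(generate, dialogue, **generation_kwargs):
    """Build the prompt, call a pipeline-like callable, and unwrap the first result."""
    prompt = f"Summarize this dialogue:\n{dialogue}\nSummary:"
    outputs = generate(prompt, **generation_kwargs)
    # Expected shape: [{"generated_text": "..."}]
    return outputs[0]["generated_text"].strip()


def fake_generate(prompt, **kwargs):
    """Stand-in for a real pipeline; returns a canned result in the same shape."""
    return [{"generated_text": " A's router keeps disconnecting and B suggests a firmware update. "}]


result = summarize(
    fake_generate,
    "A: The router keeps disconnecting every hour.",
    max_length=150,
    num_beams=3,
)
print(result)
```

With a real model, the `summarizer` object created by `pipeline(...)` can be passed in place of `fake_generate`.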
+## Training Procedure
+- **Hardware**: T4 GPU on Kaggle
+- **Framework**: PyTorch with Hugging Face Transformers
+- **Training Time**: ~45 minutes (Kaggle free tier)
+## Recommendations
+- Use beam search (num_beams=3-5) for better results
+- Combine with post-processing for formatting
+- Fine-tune longer for complex dialogues
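The card does not specify what post-processing it has in mind. One minimal possibility, sketched here with a hypothetical `tidy_summary` helper, is to normalize whitespace, capitalize the first character, and make sure the summary ends with punctuation.

```python
import re


def tidy_summary(text: str) -> str:
    """Normalize whitespace, capitalize the first letter, and ensure end punctuation."""
    text = re.sub(r"\s+", " ", text).strip()  # collapse runs of spaces and newlines
    if not text:
        return text
    text = text[0].upper() + text[1:]
    if text[-1] not in ".!?":
        text += "."
    return text


print(tidy_summary("  person1 asks   about the router \n"))
```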
+## Limitations
+- Struggles with multi-topic dialogues
+- May miss subtle contextual cues
+- Best performance on short conversations (<500 tokens)
+## License
+Apache 2.0 (same as the base FLAN-T5 model)
+## Citation
+@misc{dialogsum2021,
+title={DialogSum: A Real-Life Scenario Dialogue Summarization Dataset},
+author={Chen, Yulong and Liu, Yang and Chen, Liang and Zhang, Yue},
+year={2021},
+howpublished={HuggingFace Datasets},
+}