---
language: en
tags:
- summarization
- bart
- lora
- fine-tuning
- agriculture
- huggingface
license: apache-2.0
datasets:
- gutenberg-books
library_name: transformers
---
# LoRA Fine-Tuned BART for Agricultural Text Summarization
## Model Overview
This is a **LoRA fine-tuned version** of `facebook/bart-large-cnn`, specialized for **summarizing agricultural texts**.
The model has been trained on processed **agricultural e-books** sourced from Project Gutenberg, using **Low-Rank Adaptation (LoRA)** for efficient fine-tuning.
Books used:
- https://www.gutenberg.org/ebooks/56640
- https://www.gutenberg.org/ebooks/67813
- https://www.gutenberg.org/ebooks/20772
- https://www.gutenberg.org/ebooks/40190
- https://www.gutenberg.org/ebooks/4924
- https://www.gutenberg.org/ebooks/4525
- **Base Model:** [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn)
- **Fine-Tuning Method:** LoRA (Low-Rank Adaptation)
- **Dataset:** Processed agricultural e-books from Gutenberg
- **Primary Task:** Summarization
## Training Details
- **LoRA Configuration** (see the training sketch after this list):
- Rank (`r`): 8
- Alpha (`lora_alpha`): 16
- Dropout (`lora_dropout`): 0.1
- **Training Setup:**
- **Batch Size:** 8
- **Gradient Accumulation Steps:** 2
- **Learning Rate:** 2e-5
- **Epochs:** 3
- **Optimizer:** AdamW (bitsandbytes, if available)
- **Precision:** Mixed-precision (`fp16`)
- **Dataset Processing:**
- Texts were tokenized using the **BART tokenizer**.
- Chunking was performed with LangChain's `RecursiveCharacterTextSplitter` (max 300 words per chunk).
- Training pairs were created using **LLM-based summarization** (`meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo-p`).
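
The configuration above corresponds roughly to the following training sketch. It is illustrative rather than the exact script used: the input file path, the `generate_llm_summaries` helper, and the LoRA `target_modules` are assumptions, and the LLM summarization step itself is not shown.

```python
from datasets import Dataset
from langchain_text_splitters import RecursiveCharacterTextSplitter
from peft import LoraConfig, TaskType, get_peft_model
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base_model = "facebook/bart-large-cnn"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSeq2SeqLM.from_pretrained(base_model)

# LoRA configuration: r=8, alpha=16, dropout=0.1 (target modules are an assumption).
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, lora_config)

# Split the cleaned book text into chunks of at most ~300 words.
splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,
    chunk_overlap=0,
    length_function=lambda text: len(text.split()),  # measure length in words
)
book_text = open("agriculture_books.txt", encoding="utf-8").read()  # hypothetical path
chunks = splitter.split_text(book_text)

# Target summaries come from the LLM-based summarization step described above;
# `generate_llm_summaries` is a hypothetical placeholder for that step.
summaries = generate_llm_summaries(chunks)
dataset = Dataset.from_dict({"text": chunks, "summary": summaries})

def preprocess(example):
    model_inputs = tokenizer(example["text"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=example["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

training_args = Seq2SeqTrainingArguments(
    output_dir="bart-large-lora-finetuned-agriculture",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    learning_rate=2e-5,
    num_train_epochs=3,
    fp16=True,
    optim="adamw_bnb_8bit",  # bitsandbytes AdamW; use "adamw_torch" if unavailable
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset.map(preprocess, remove_columns=dataset.column_names),
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```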
## How to Use
### Load the Model in Transformers
```python
from transformers import pipeline
# Load fine-tuned model from Hugging Face
summarizer = pipeline("summarization", model="your_username/bart-large-lora-finetuned-agriculture")
# Sample text for summarization
text = "Crop rotation helps maintain soil health by alternating different crops each season."
# Generate summary
summary = summarizer(text, max_length=100, min_length=30, do_sample=False)[0]["summary_text"]
print("Summary:", summary)