---
base_model:
- google/gemma-2-2b-it
datasets:
- saidines12/telugu_news_dataset
---
# Model Card for Gemma-2-2B-it Telugu News Headline Generator
This model is a fine-tuned version of Google's Gemma-2-2B-it instruction-tuned model, optimized for generating Telugu news headlines from article content. It was trained with Supervised Fine-Tuning (SFT) on a Telugu news dataset.
## Model Details
### Model Description
- **Developed by:** saidines12 (Telugu news fine-tune of Google's base model)
- **Model type:** Decoder-only transformer language model
- **Language(s):** Telugu
- **License:** Gemma (inherited from the base model)
- **Finetuned from model:** google/gemma-2-2b-it
### Model Sources
- **Repository:** [saidines12/telugu-news-headline-generation](https://huggingface.co/saidines12/telugu-news-headline-generation) on the Hugging Face Hub
- **Base Model:** google/gemma-2-2b-it
## How to Get Started with the Model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation")
tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")

# Prepend the instruction the model was fine-tuned on, then the article text
text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)  # headlines are short
headline = tokenizer.decode(outputs[0], skip_special_tokens=True)
```
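Since the base model is instruction-tuned, wrapping the prompt in Gemma's chat template may help. Whether the fine-tune was trained with that format is an assumption, so treat this as an optional variant to try:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation")
tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")

# Same instruction + article, wrapped in the tokenizer's chat template
messages = [{"role": "user", "content": "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids, max_new_tokens=64)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```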
## Training Details
### Training Data
- Telugu news articles and headlines dataset
- Data cleaned and preprocessed for headline generation task
- Articles spanning various news categories
### Training Procedure
#### Training Hyperparameters
- **Training regime:** FP16 mixed precision
- **Batch size:** 6 per device
- **Gradient accumulation steps:** 4
- **Learning rate:** 2e-4
- **Maximum steps:** 20,000
- **Warmup steps:** 25
- **Optimizer:** AdamW
- **Evaluation strategy:** Every 20,000 steps (effectively once, at the end of training)
#### Hardware Specifications
- GPU training with gradient checkpointing
- Parallel data loading with 8 workers
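The training script itself is not published; the following sketch shows how the hyperparameters and hardware settings above might map onto TRL's `SFTTrainer`. The dataset column name, `output_dir`, and everything not listed above are assumptions:
```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("saidines12/telugu_news_dataset")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")

config = SFTConfig(
    output_dir="gemma-2-2b-it-telugu-headlines",  # assumed
    per_device_train_batch_size=6,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    max_steps=20_000,
    warmup_steps=25,
    optim="adamw_torch",
    fp16=True,                      # FP16 mixed precision
    gradient_checkpointing=True,    # trade compute for memory
    eval_strategy="steps",
    eval_steps=20_000,
    dataloader_num_workers=8,       # parallel data loading
    dataloader_pin_memory=True,     # pinned memory for faster host-to-GPU copies
)

trainer = SFTTrainer(
    model=model,
    args=config,
    train_dataset=dataset["train"],  # assumes a "text" column with prompt + headline
)
trainer.train()
```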
## Evaluation
### ROUGE Score Comparison
| Metric | Base Model | Finetuned Model | Improvement |
|---------|------------|-----------------|-------------|
| ROUGE-1 | 3.39 | 4.64 | +1.25 |
| ROUGE-2 | 0.26 | 0.41 | +0.15 |
| ROUGE-L | 3.38 | 4.63 | +1.25 |
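The scoring script is not published, but comparable ROUGE numbers can be computed with the `evaluate` library along these lines. This is a sketch; note that the default ROUGE tokenizer is English-oriented, which may contribute to the low absolute scores on Telugu text:
```python
# pip install evaluate rouge_score
import evaluate

rouge = evaluate.load("rouge")

# Placeholder pairs; the real evaluation used held-out article/headline pairs
predictions = ["<generated Telugu headline>"]
references = ["<reference Telugu headline>"]

scores = rouge.compute(predictions=predictions, references=references)
print({k: round(v * 100, 2) for k, v in scores.items()})  # rouge1 / rouge2 / rougeL F1, as percentages
```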
### Model Prediction Comparison Using a Larger Model as Judge
| Category | Count | Percentage |
|-------------------|-------|------------|
| Total samples | 5962 | 100% |
| Judged the same | 3 | 0.05% |
| Fine-tuned judged better | 4610 | 77.32% |
| Fine-tuned judged worse | 1349 | 22.63% |
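The judging setup is not published; the following is a hypothetical sketch of how such a larger-model comparison could be run. The judge model (`google/gemma-2-9b-it`), the prompt wording, and the answer format are all assumptions:
```python
# Hypothetical sketch: compare base vs. fine-tuned headlines with a larger judge model.
from transformers import pipeline

judge = pipeline("text-generation", model="google/gemma-2-9b-it", device_map="auto")

def judge_pair(article: str, headline_a: str, headline_b: str) -> str:
    """Ask the judge which headline fits the article better: 'A', 'B', or 'same'."""
    prompt = (
        "Given this Telugu news article, which headline is more appropriate and "
        "relevant? Answer with exactly one of: A, B, same.\n\n"
        f"Article: {article}\n\nHeadline A: {headline_a}\nHeadline B: {headline_b}\n\nAnswer:"
    )
    out = judge(prompt, max_new_tokens=4, return_full_text=False)
    return out[0]["generated_text"].strip()
```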
### Evaluation Methods
- ROUGE scores for headline similarity
- A larger LLM's judgments of headline appropriateness and relevance (as sketched above)
## Inference
#### Running the model on a GPU using different precisions
* _Using `torch.float16`_
```python
# pip install accelerate
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
model = AutoModelForCausalLM.from_pretrained(
    "saidines12/telugu-news-headline-generation",
    device_map="auto",
    torch_dtype=torch.float16,
)
input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```
* _Using `torch.bfloat16`_
```python
# pip install accelerate
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
model = AutoModelForCausalLM.from_pretrained(
    "saidines12/telugu-news-headline-generation",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```
#### Quantized Versions through `bitsandbytes`
* _Using 8-bit precision (int8)_
```python
# pip install bitsandbytes accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
quantization_config = BitsAndBytesConfig(load_in_8bit=True)
tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation", quantization_config=quantization_config)
input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```
* _Using 4-bit precision_
```python
# pip install bitsandbytes accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
quantization_config = BitsAndBytesConfig(load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation", quantization_config=quantization_config)
input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```
#### Other optimizations
* _Flash Attention 2_
First make sure to install `flash-attn` in your environment: `pip install flash-attn`
```diff
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.float16,
+ attn_implementation="flash_attention_2"
).to(0)
```
### Inputs and outputs
* **Input:** A text string containing the headline-generation instruction followed by a Telugu news article.
* **Output:** A short generated Telugu-language headline for the article.
## Technical Specifications
### Model Architecture and Objective
- Base architecture: Gemma-2
- Training objective: Supervised fine-tuning for headline generation
- Gradient checkpointing enabled for memory efficiency
- Optimized data loading with pinned memory
### Software
- PyTorch
- Transformers library
- TRL for supervised fine-tuning
- CUDA for GPU acceleration
## Uses
### Direct Use
This model is designed for generating Telugu news headlines from article content. It can be used by:
- News organizations for automated headline generation
- Content creators working with Telugu news content
- Researchers studying Telugu natural language generation
### Out-of-Scope Use
- The model should not be used for generating fake news or misleading headlines
- Not suitable for non-Telugu content
- Not designed for general text generation tasks
- Should not be used for classification or other non-headline generation tasks
## Bias, Risks, and Limitations
- May reflect biases present in Telugu news media
- Performance may vary based on news domain and writing style
- Limited to the vocabulary and patterns present in the training data
- May occasionally generate grammatically incorrect Telugu text
- Could potentially generate sensationalized headlines
### Recommendations
- Use with human oversight for published content
- Verify generated headlines for accuracy
- Monitor output for potential biases
- Implement content filtering for inappropriate generations
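As a starting point for the filtering recommendation above, here is a minimal sketch of a post-generation sanity check. The length bounds and character heuristic are illustrative assumptions, not part of the released model:
```python
def looks_like_valid_headline(headline: str, min_len: int = 5, max_len: int = 120) -> bool:
    """Cheap sanity checks to run before a human reviews the headline."""
    headline = headline.strip()
    if not (min_len <= len(headline) <= max_len):
        return False
    # Require mostly Telugu script (Unicode block U+0C00..U+0C7F)
    telugu_chars = sum(1 for ch in headline if "\u0C00" <= ch <= "\u0C7F")
    return telugu_chars / len(headline) > 0.5

# Route failures to human review instead of publishing directly
candidate_headline = "<generated Telugu headline>"  # e.g., output of the quick-start example
if not looks_like_valid_headline(candidate_headline):
    print("Flagged for manual review:", candidate_headline)
```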