Update README.md

README.md (CHANGED)

---
license: apache-2.0
tags:
- trl
- sft
- telugu
---

# Model Card for Gemma-2B Telugu News Headline Generator

This model is a fine-tuned version of Google's Gemma-2B model, optimized for generating Telugu news headlines from article content. It was trained using supervised fine-tuning (SFT) on a Telugu news dataset.

## Model Details

### Model Description

- **Developed by:** Google (base model); fine-tuned for Telugu news headline generation
- **Model type:** Decoder-only transformer language model
- **Language(s):** Telugu
- **License:** Apache 2.0
- **Finetuned from model:** google/gemma-2b

### Model Sources

- **Repository:** Hugging Face Hub
- **Base model:** google/gemma-2b

## Uses

### Direct Use

This model is designed for generating Telugu news headlines from article content. It can be used by:

- News organizations for automated headline generation
- Content creators working with Telugu news content
- Researchers studying Telugu natural language generation

### Out-of-Scope Use

- Should not be used to generate fake news or misleading headlines
- Not suitable for non-Telugu content
- Not designed for general text generation tasks
- Not intended for classification or other non-headline tasks

## Bias, Risks, and Limitations

- May reflect biases present in Telugu news media
- Performance may vary by news domain and writing style
- Limited to the vocabulary and patterns present in the training data
- May occasionally generate grammatically incorrect Telugu text
- Could potentially generate sensationalized headlines

### Recommendations

- Use with human oversight for published content
- Verify generated headlines for factual accuracy
- Monitor output for potential biases
- Implement content filtering for inappropriate generations

## How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation")
tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")

# The model expects the instruction followed by the article text.
text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)

# Decode only the newly generated tokens; decoding outputs[0] in full
# would echo the prompt before the headline.
headline = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
```

## Training Details

### Training Data

- Telugu news articles and headlines dataset
- Data cleaned and preprocessed for the headline generation task
- Articles spanning various news categories

### Training Procedure

#### Training Hyperparameters

- **Training regime:** FP16 mixed precision
- **Batch size:** 4 per device
- **Gradient accumulation steps:** 4 (effective batch size of 16 per device)
- **Learning rate:** 2e-4
- **Maximum steps:** 30,000
- **Warmup steps:** 25
- **Optimizer:** AdamW
- **Evaluation strategy:** Every 30,000 steps (i.e., once, at the end of the 30,000-step run)

#### Hardware Specifications

- GPU training with gradient checkpointing
- Parallel data loading with 8 workers

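As a minimal sketch, the settings above could be wired together with TRL's `SFTTrainer` roughly as follows. The dataset, its `text` field, and the output directory are placeholders (the model card does not name them), and exact keyword arguments vary across TRL/Transformers versions:

```python
from datasets import Dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# Placeholder dataset: a single "text" field holding instruction + article + headline.
# The actual Telugu news dataset used for this model is not published in the card.
train_dataset = Dataset.from_dict(
    {"text": ["Generate ... short headline ... in telugu language\n <article>\n<headline>"]}
)

args = TrainingArguments(
    output_dir="gemma-2b-telugu-headlines",  # assumed name
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    max_steps=30_000,
    warmup_steps=25,
    optim="adamw_torch",                     # AdamW
    fp16=True,                               # FP16 mixed precision
    gradient_checkpointing=True,
    dataloader_num_workers=8,
    dataloader_pin_memory=True,
)

trainer = SFTTrainer(
    model="google/gemma-2b",      # base model, loaded by name
    args=args,
    train_dataset=train_dataset,
    dataset_text_field="text",    # older TRL API; newer versions move this into SFTConfig
)
trainer.train()
```
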
## Evaluation

### Metrics

- ROUGE scores for headline similarity
- Human evaluation for headline appropriateness

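For reference, a minimal sketch of computing ROUGE with the Hugging Face `evaluate` library. The headline strings are placeholders, and since the metric's default tokenizer is English-oriented, a simple whitespace tokenizer is passed for Telugu:

```python
import evaluate

rouge = evaluate.load("rouge")

# Placeholder generated/reference headline pairs.
predictions = ["<generated Telugu headline>"]
references = ["<reference Telugu headline>"]

# ROUGE's default stemming/tokenization targets English; splitting on
# whitespace compares Telugu words as-is.
scores = rouge.compute(
    predictions=predictions,
    references=references,
    tokenizer=lambda text: text.split(),
)
print(scores)  # rouge1, rouge2, rougeL, rougeLsum F1 scores
```
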
## Technical Specifications

### Model Architecture and Objective

- Base architecture: Gemma-2B
- Training objective: supervised fine-tuning for headline generation
- Gradient checkpointing enabled for memory efficiency
- Optimized data loading with pinned memory

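Outside the Trainer, the same two memory/throughput options can be enabled directly in PyTorch; a small sketch with a dummy pre-tokenized dataset standing in for the real one:

```python
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

# Trade compute for memory: recompute activations during the backward pass.
model.gradient_checkpointing_enable()

# Placeholder dataset of pre-tokenized examples. Pinned host memory speeds up
# host-to-GPU copies, and 8 workers load batches in parallel.
dummy = [{"input_ids": torch.zeros(16, dtype=torch.long)} for _ in range(32)]
loader = DataLoader(dummy, batch_size=4, num_workers=8, pin_memory=True)
```
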
### Software

- PyTorch
- Transformers library
- TRL for supervised fine-tuning
- CUDA for GPU acceleration