---
base_model:
- google/gemma-2-2b-it
datasets:
- saidines12/telugu_news_dataset
---
# Model Card for Gemma-2-2B-it Telugu News Headline Generator
This model is a fine-tuned version of Google's Gemma-2-2B-it instruction-tuned model, optimized for generating Telugu news headlines from article content. It was trained with Supervised Fine-Tuning (SFT) on a Telugu news dataset.
## Model Details
### Model Description
- **Developed by:** saidines12 (Telugu news fine-tune of Google's base model)
- **Model type:** Decoder-only transformer language model
- **Language(s):** Telugu
- **License:** Gemma (inherited from the base model)
- **Finetuned from model:** google/gemma-2-2b-it
### Model Sources
- **Repository:** [saidines12/telugu-news-headline-generation](https://huggingface.co/saidines12/telugu-news-headline-generation) on the Hugging Face Hub
- **Base Model:** google/gemma-2-2b-it
## How to Get Started with the Model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation")
tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")

# Prepend the instruction the model was fine-tuned on, then the article text
text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)  # headlines are short
headline = tokenizer.decode(outputs[0], skip_special_tokens=True)
```
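Since the base model is instruction-tuned, wrapping the prompt in Gemma's chat template may help. Whether the fine-tune was trained with that format is an assumption, so treat this as an optional variant to try:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation")
tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")

# Same instruction + article, wrapped in the tokenizer's chat template
messages = [{"role": "user", "content": "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids, max_new_tokens=64)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```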
## Training Details
### Training Data
- Telugu news articles and headlines dataset
- Data cleaned and preprocessed for headline generation task
- Articles spanning various news categories
### Training Procedure
#### Training Hyperparameters
- **Training regime:** FP16 mixed precision
- **Batch size:** 6 per device
- **Gradient accumulation steps:** 4
- **Learning rate:** 2e-4
- **Maximum steps:** 20,000
- **Warmup steps:** 25
- **Optimizer:** AdamW
- **Evaluation strategy:** Every 20,000 steps (effectively once, at the end of training)
#### Hardware Specifications
- GPU training with gradient checkpointing
- Parallel data loading with 8 workers
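The training script itself is not published; the following sketch shows how the hyperparameters and hardware settings above might map onto TRL's `SFTTrainer`. The dataset column name, `output_dir`, and everything not listed above are assumptions:
```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("saidines12/telugu_news_dataset")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")

config = SFTConfig(
    output_dir="gemma-2-2b-it-telugu-headlines",  # assumed
    per_device_train_batch_size=6,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    max_steps=20_000,
    warmup_steps=25,
    optim="adamw_torch",
    fp16=True,                      # FP16 mixed precision
    gradient_checkpointing=True,    # trade compute for memory
    eval_strategy="steps",
    eval_steps=20_000,
    dataloader_num_workers=8,       # parallel data loading
    dataloader_pin_memory=True,     # pinned memory for faster host-to-GPU copies
)

trainer = SFTTrainer(
    model=model,
    args=config,
    train_dataset=dataset["train"],  # assumes a "text" column with prompt + headline
)
trainer.train()
```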
## Evaluation
### ROUGE Score Comparison
| Metric | Base Model | Finetuned Model | Improvement |
|---------|------------|-----------------|-------------|
| ROUGE-1 | 3.39 | 4.64 | +1.25 |
| ROUGE-2 | 0.26 | 0.41 | +0.15 |
| ROUGE-L | 3.38 | 4.63 | +1.25 |
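The scoring script is not published, but comparable ROUGE numbers can be computed with the `evaluate` library along these lines. This is a sketch; note that the default ROUGE tokenizer is English-oriented, which may contribute to the low absolute scores on Telugu text:
```python
# pip install evaluate rouge_score
import evaluate

rouge = evaluate.load("rouge")

# Placeholder pairs; the real evaluation used held-out article/headline pairs
predictions = ["<generated Telugu headline>"]
references = ["<reference Telugu headline>"]

scores = rouge.compute(predictions=predictions, references=references)
print({k: round(v * 100, 2) for k, v in scores.items()})  # rouge1 / rouge2 / rougeL F1, as percentages
```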
### Model Prediction Comparison Using a Larger Model as Judge
| Category | Count | Percentage |
|-------------------|-------|------------|
| Total samples | 5962 | 100% |
| Judged the same | 3 | 0.05% |
| Fine-tuned judged better | 4610 | 77.32% |
| Fine-tuned judged worse | 1349 | 22.63% |
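The judging setup is not published; the following is a hypothetical sketch of how such a larger-model comparison could be run. The judge model (`google/gemma-2-9b-it`), the prompt wording, and the answer format are all assumptions:
```python
# Hypothetical sketch: compare base vs. fine-tuned headlines with a larger judge model.
from transformers import pipeline

judge = pipeline("text-generation", model="google/gemma-2-9b-it", device_map="auto")

def judge_pair(article: str, headline_a: str, headline_b: str) -> str:
    """Ask the judge which headline fits the article better: 'A', 'B', or 'same'."""
    prompt = (
        "Given this Telugu news article, which headline is more appropriate and "
        "relevant? Answer with exactly one of: A, B, same.\n\n"
        f"Article: {article}\n\nHeadline A: {headline_a}\nHeadline B: {headline_b}\n\nAnswer:"
    )
    out = judge(prompt, max_new_tokens=4, return_full_text=False)
    return out[0]["generated_text"].strip()
```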
### Evaluation Methods
- ROUGE scores for headline similarity
- A larger LLM's judgments of headline appropriateness and relevance (as sketched above)
## Inference
#### Running the model on a GPU using different precisions
* _Using `torch.float16`_
```python
# pip install accelerate
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
model = AutoModelForCausalLM.from_pretrained(
    "saidines12/telugu-news-headline-generation",
    device_map="auto",
    torch_dtype=torch.float16,
)
input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```
* _Using `torch.bfloat16`_
```python
# pip install accelerate
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
model = AutoModelForCausalLM.from_pretrained(
    "saidines12/telugu-news-headline-generation",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```
#### Quantized Versions through `bitsandbytes`
* _Using 8-bit precision (int8)_
```python
# pip install bitsandbytes accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
quantization_config = BitsAndBytesConfig(load_in_8bit=True)
tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation", quantization_config=quantization_config)
input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```
* _Using 4-bit precision_
```python
# pip install bitsandbytes accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
quantization_config = BitsAndBytesConfig(load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation", quantization_config=quantization_config)
input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```
#### Other optimizations
* _Flash Attention 2_
First make sure to install `flash-attn` in your environment: `pip install flash-attn`
```diff
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.float16,
+ attn_implementation="flash_attention_2"
).to(0)
```
### Inputs and outputs
* **Input:** A text string containing the headline-generation instruction followed by a Telugu news article.
* **Output:** A short generated Telugu-language headline for the article.
## Technical Specifications
### Model Architecture and Objective
- Base architecture: Gemma-2
- Training objective: Supervised fine-tuning for headline generation
- Gradient checkpointing enabled for memory efficiency
- Optimized data loading with pinned memory
### Software
- PyTorch
- Transformers library
- TRL for supervised fine-tuning
- CUDA for GPU acceleration
## Uses
### Direct Use
This model is designed for generating Telugu news headlines from article content. It can be used by:
- News organizations for automated headline generation
- Content creators working with Telugu news content
- Researchers studying Telugu natural language generation
### Out-of-Scope Use
- The model should not be used for generating fake news or misleading headlines
- Not suitable for non-Telugu content
- Not designed for general text generation tasks
- Should not be used for classification or other non-headline generation tasks
## Bias, Risks, and Limitations
- May reflect biases present in Telugu news media
- Performance may vary based on news domain and writing style
- Limited to the vocabulary and patterns present in the training data
- May occasionally generate grammatically incorrect Telugu text
- Could potentially generate sensationalized headlines
### Recommendations
- Use with human oversight for published content
- Verify generated headlines for accuracy
- Monitor output for potential biases
- Implement content filtering for inappropriate generations
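As a starting point for the filtering recommendation above, here is a minimal sketch of a post-generation sanity check. The length bounds and character heuristic are illustrative assumptions, not part of the released model:
```python
def looks_like_valid_headline(headline: str, min_len: int = 5, max_len: int = 120) -> bool:
    """Cheap sanity checks to run before a human reviews the headline."""
    headline = headline.strip()
    if not (min_len <= len(headline) <= max_len):
        return False
    # Require mostly Telugu script (Unicode block U+0C00..U+0C7F)
    telugu_chars = sum(1 for ch in headline if "\u0C00" <= ch <= "\u0C7F")
    return telugu_chars / len(headline) > 0.5

# Route failures to human review instead of publishing directly
candidate_headline = "<generated Telugu headline>"  # e.g., output of the quick-start example
if not looks_like_valid_headline(candidate_headline):
    print("Flagged for manual review:", candidate_headline)
```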