saidines12 committed
Commit 3691cb1 · verified · 1 Parent(s): 26e2627

Update README.md

Files changed (1):
  1. README.md +119 -1
README.md CHANGED
@@ -89,13 +89,131 @@ headline = tokenizer.decode(outputs[0], skip_special_tokens=True)
  - GPU training with gradient checkpointing
  - Parallel data loading with 8 workers
 
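+ A minimal sketch of this training configuration with `transformers.TrainingArguments`; only the two flags below are stated in this card, and the output directory is an illustrative placeholder:
+
+ ```python
+ from transformers import TrainingArguments
+
+ args = TrainingArguments(
+     output_dir="headline-finetune",  # illustrative placeholder
+     gradient_checkpointing=True,     # stated: GPU training with gradient checkpointing
+     dataloader_num_workers=8,        # stated: parallel data loading with 8 workers
+ )
+ ```
+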
  ## Evaluation
 
- ### Metrics
+ ### ROUGE Score Comparison
+
+ | Metric  | Base Model | Finetuned Model | Improvement |
+ |---------|------------|-----------------|-------------|
+ | ROUGE-1 | 2.85       | 4.67            | +1.82       |
+ | ROUGE-2 | 0.25       | 0.41            | +0.16       |
+ | ROUGE-L | 2.84       | 4.65            | +1.81       |
+
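+ A minimal sketch of how such ROUGE numbers can be computed with the Hugging Face `evaluate` library; the example headlines are placeholders, and the exact evaluation script is not part of this card:
+
+ ```python
+ # pip install evaluate rouge_score
+ import evaluate
+
+ rouge = evaluate.load("rouge")
+
+ predictions = ["model headline 1", "model headline 2"]  # generated headlines
+ references = ["gold headline 1", "gold headline 2"]     # reference headlines
+
+ scores = rouge.compute(
+     predictions=predictions,
+     references=references,
+     tokenizer=lambda text: text.split(),  # whitespace tokenizer; the default ROUGE tokenizer drops non-Latin scripts such as Telugu
+ )
+ print(scores["rouge1"], scores["rouge2"], scores["rougeL"])
+ ```
+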
+ ### Model Prediction Comparison (finetuned vs. base, judged by a larger model)
+
+ | Category           | Count | Percentage |
+ |--------------------|-------|------------|
+ | Total samples      | 5962  | 100%       |
+ | Same predictions   | 1     | 0.02%      |
+ | Better predictions | 4697  | 78.78%     |
+ | Worse predictions  | 1264  | 21.20%     |
+
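+ A sketch of the pairwise comparison behind this table, assuming a generic instruction-tuned judge model; the judge model id, prompt wording, and verdict parsing are illustrative assumptions, not the exact setup used:
+
+ ```python
+ from transformers import pipeline
+
+ judge = pipeline("text-generation", model="<larger-judge-model>")  # hypothetical judge model id
+
+ def compare(article: str, base_headline: str, finetuned_headline: str) -> str:
+     prompt = (
+         f"Article:\n{article}\n\n"
+         f"Headline A: {base_headline}\n"
+         f"Headline B: {finetuned_headline}\n"
+         "Which headline is better for the article: A, B, or SAME? Answer with one word."
+     )
+     out = judge(prompt, max_new_tokens=5, return_full_text=False)[0]["generated_text"]
+     return out.strip().upper()  # "A" (base better), "B" (finetuned better), or "SAME"
+ ```
+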
+ ### Evaluation Methods
  - ROUGE scores for headline similarity
  - Human evaluation for headline appropriateness
 
 
+ ## Inference
+
+ #### Running the model on a GPU using different precisions
+
+ * _Using `torch.float16`_
+
+ ```python
+ # pip install accelerate
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
+ model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation", device_map="auto", torch_dtype=torch.float16)
+
+ input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
+ input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
+
+ outputs = model.generate(**input_ids)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
+ * _Using `torch.bfloat16`_
+
+ ```python
+ # pip install accelerate
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
+ model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation", device_map="auto", torch_dtype=torch.bfloat16)
+
+ input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
+ input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
+
+ outputs = model.generate(**input_ids)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
+ #### Quantized Versions through `bitsandbytes`
+
+ * _Using 8-bit precision (int8)_
+
+ ```python
+ # pip install bitsandbytes accelerate
+ from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
+
+ quantization_config = BitsAndBytesConfig(load_in_8bit=True)
+
+ tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
+ model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation", quantization_config=quantization_config)
+
+ input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
+ input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
+
+ outputs = model.generate(**input_ids)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
+ * _Using 4-bit precision_
+
+ ```python
+ # pip install bitsandbytes accelerate
+ from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
+
+ quantization_config = BitsAndBytesConfig(load_in_4bit=True)
+
+ tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
+ model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation", quantization_config=quantization_config)
+
+ input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
+ input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
+
+ outputs = model.generate(**input_ids)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
+
+ #### Other optimizations
+
+ * _Flash Attention 2_
+
+ First make sure to install `flash-attn` in your environment: `pip install flash-attn`.
+
+ ```diff
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.float16,
+ +   attn_implementation="flash_attention_2"
+ ).to(0)
+ ```
+
+ ### Inputs and outputs
+
+ * **Input:** A text string containing the headline-generation instruction followed by a Telugu news article, as shown in the examples above.
+ * **Output:** A generated short Telugu headline for the article.
+
+
 
  ## Technical Specifications