Model description

flan-t5-large-samsum-qlora is a LoRA adapter for google/flan-t5-large, fine-tuned on the samsum dialogue dataset. Parameter-efficient fine-tuning with QLoRA was used to adapt the base model. The model is intended for generative summarization tasks and achieved the following ROUGE scores on the test dataset:

  • Rouge1: 49.249596%
  • Rouge2: 23.513032%
  • RougeL: 39.960812%
  • RougeLsum: 39.968438%
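For intuition, ROUGE-1 is the unigram-overlap F1 between a generated summary and a reference. A minimal plain-Python sketch is below; it is illustrative only and omits the stemming and normalization that the standard ROUGE scorer applies, so its numbers will not exactly match the scores above:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1(
    "emma and peter are going to amiens",
    "emma and peter go to amiens tomorrow",
)
print(round(score, 3))  # 5 of 7 unigrams overlap in each direction -> 0.714
```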

How to use

Load the model:

from peft import PeftModel, PeftConfig
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, BitsAndBytesConfig

# Load the peft adapter model config
peft_model_id = 'MuntasirHossain/flan-t5-large-samsum-qlora' 
peft_config = PeftConfig.from_pretrained(peft_model_id)

# load the base model and tokenizer
base_model = AutoModelForSeq2SeqLM.from_pretrained(
    peft_config.base_model_name_or_path,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(peft_config.base_model_name_or_path)

# Load the peft model
model = PeftModel.from_pretrained(base_model, peft_model_id, device_map="auto")
model.eval()

Example Inference:

# random sample text from the samsum test dataset
text = """
Emma: Hi, we're going with Peter to Amiens tomorrow.
Daniel: oh! Cool.
Emma: Wanna join?
Daniel: Sure, I'm fed up with Paris.
Emma: We're too. The noise, traffic etc. Would be nice to see some countrysides.
Daniel: I don't think Amiens is exactly countrysides though :P
Emma: Nope. Hahahah. But not a megalopolis either!
Daniel: Right! Let's do it!
Emma: But we should leave early. The days are shorter now.
Daniel: Yes, the stupid winter time.
Emma: Exactly!
Daniel: Where should we meet then?
Emma: Come to my place by 9am.
Daniel: oohhh. It means I have to get up before 7!
Emma: Yup. The early bird gets the worm (in Amiens).
Daniel: You sound like my grandmother.
Emma: HAHAHA. I'll even add: no parties tonight, no drinking dear Daniel
Daniel: I really hope Amiens is worth it!
"""

inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(input_ids=inputs["input_ids"].to(model.device), max_new_tokens=40)
print("Summary: ", tokenizer.decode(outputs[0], skip_special_tokens=True))

Summary:  Emma and Peter are going to Amiens tomorrow. Daniel will join them. They will meet at Emma's place by 9 am. They will not have any parties tonight.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
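The QLoRA setup behind these hyperparameters can be sketched as follows. This is an illustrative configuration fragment, not the exact training script: the 4-bit quantization settings and the LoRA rank, alpha, dropout, and target modules are assumptions, while the training arguments mirror the list above.

```python
import torch
from transformers import (
    AutoModelForSeq2SeqLM,
    BitsAndBytesConfig,
    Seq2SeqTrainingArguments,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model quantized to 4-bit (QLoRA freezes these weights)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/flan-t5-large", quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter config -- rank/alpha/target modules are illustrative assumptions
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q", "v"],  # T5 attention projections
    lora_dropout=0.05,
    task_type="SEQ_2_SEQ_LM",
)
model = get_peft_model(model, lora_config)

# Hyperparameters from the list above
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-samsum-qlora",
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=5,
)
```

Only the small LoRA matrices are trained; the quantized base weights stay frozen, which is what keeps memory use low enough to fine-tune flan-t5-large on a single GPU.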

Framework versions

  • PEFT 0.8.2
  • Transformers 4.38.1
  • Pytorch 2.1.0+cu121
  • Datasets 2.17.1
  • Tokenizers 0.15.2