	---
	library_name: peft
	tags:
	- generated_from_trainer
	base_model: microsoft/phi-2
	model-index:
	- name: phi-2-universal-NER
	  results: []
	datasets:
	- Universal-NER/Pile-NER-type
	language:
	- en
	---

	# phi-2-universal-NER

	This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) on the Universal-NER/Pile-NER-type dataset.

	## Model description

	This model shows the power of small language models. Phi-2 can be fine-tuned on the free version of Google Colab, and the process is simple. I couldn't fine-tune the whole model on free Colab, so I used PEFT.
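
To give a rough sense of why PEFT fits where full fine-tuning does not, here is a small back-of-the-envelope sketch. It assumes LoRA, one common PEFT method; this card does not say which method or settings were used, and the rank below is a hypothetical value chosen only for illustration.

```python
# Illustrative only: LoRA is assumed here; the card says "PEFT" without naming the method.

def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) weight matrix.

    LoRA learns two low-rank factors, A (rank x d_in) and B (d_out x rank),
    and leaves the original matrix frozen.
    """
    return rank * d_in + d_out * rank


hidden = 2560  # Phi-2 hidden size
rank = 16      # hypothetical rank, not taken from this card

full = hidden * hidden                          # one square projection, fully trained
lora = lora_param_count(hidden, hidden, rank)   # the same projection with a LoRA adapter

print(f"full matrix:  {full:,} parameters")
print(f"LoRA adapter: {lora:,} parameters ({lora / full:.2%} of full)")
```

Only the adapter weights need gradients and optimizer state, which is what makes training feasible on a small GPU.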

	## Intended uses & limitations

	This model is fine-tuned from Phi-2 on the UniversalNER dataset.

	The Phi-2 license has changed to MIT, but UniversalNER is still under a research license, so this model can be used for research purposes only.

	## Training and evaluation data

	I fine-tuned for just 5 epochs.

	## Training procedure notebook

	https://github.com/mit1280/fined-tuning/blob/main/phi_2_fine_tune_using_PEFT%2Binference.ipynb

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0002
	- train_batch_size: 2
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- training_steps: 1000
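
As a rough illustration of what the `cosine` scheduler does with the values above, the sketch below computes the learning rate at a few steps. It assumes no warmup and decay to zero over the 1000 training steps; warmup is not listed on this card, so `warmup_steps` is a hypothetical parameter left at 0.

```python
import math


def cosine_lr(step: int, base_lr: float = 2e-4, total_steps: int = 1000,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Print the rate at the start, at each quarter, and at the end of training.
for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_lr(step):.2e}")
```

The rate starts at the configured `learning_rate`, passes through half of it at the midpoint, and reaches zero at the final step.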

	### Inference Code

	```python
	import torch
	from peft import PeftModel, PeftConfig
	from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria

	# Load the adapter config, the base model, and the fine-tuned adapter weights.
	config = PeftConfig.from_pretrained("Mit1208/phi-2-universal-NER")
	base_model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", device_map="auto", trust_remote_code=True)
	model = PeftModel.from_pretrained(base_model, "Mit1208/phi-2-universal-NER", trust_remote_code=True)
	tokenizer = AutoTokenizer.from_pretrained("Mit1208/phi-2-universal-NER", trust_remote_code=True)

	device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

	# The text goes in the first turn; the entity question goes in the last turn.
	conversations = [
	    {"from": "human", "value": "Text: Mit Patel here from India"},
	    {"from": "gpt", "value": "I've read this text."},
	    {"from": "human", "value": "what is a name of the person in the text?"},
	]
	# Append the opening of the assistant turn so the model writes the answer.
	inference_text = tokenizer.apply_chat_template(conversations, tokenize=False) + '<|im_start|>gpt:\n'
	inputs = tokenizer(inference_text, return_tensors="pt", return_attention_mask=False).to(device)


	class EosListStoppingCriteria(StoppingCriteria):
	    """Stop generation once the output ends with the <|im_end|> token sequence."""

	    def __init__(self, eos_sequence=tokenizer.encode("<|im_end|>")):
	        self.eos_sequence = eos_sequence

	    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
	        # Compare the last len(eos_sequence) tokens of each sequence with the marker.
	        last_ids = input_ids[:, -len(self.eos_sequence):].tolist()
	        return self.eos_sequence in last_ids


	outputs = model.generate(
	    **inputs,
	    max_length=512,
	    pad_token_id=tokenizer.eos_token_id,
	    stopping_criteria=[EosListStoppingCriteria()],
	)

	text = tokenizer.batch_decode(outputs)[0]

	print(text)

	# Output
	'''
	<|im_start|>human
	Text: Mit Patel here from India<|im_end|>
	<|im_start|>gpt
	I've read this text.<|im_end|>
	<|im_start|>human
	what is a name of the person in the text?<|im_end|>
	<|im_start|>gpt:
	["Mit Patel"]<|im_end|>
	'''
	```
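
The decoded text contains the whole conversation, so you will usually want just the entity list from the final turn. Below is a minimal sketch of one way to pull it out, assuming the reply is a JSON list like the output shown above. `extract_entities` is a hypothetical helper, not part of this repository or the notebook.

```python
import json

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def extract_entities(decoded: str) -> list:
    """Return the entity list from the last turn of a decoded generation."""
    # Everything after the final <|im_start|> is the last turn.
    last_turn = decoded.rsplit(IM_START, 1)[-1]
    # Drop the role line ("gpt:" in the prompt above).
    _, _, body = last_turn.partition("\n")
    # Cut at the end-of-turn marker if the model emitted one.
    body = body.split(IM_END, 1)[0].strip()
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        return []  # the reply was not valid JSON
    return parsed if isinstance(parsed, list) else []


# Sample copied from the output shown above.
sample = (
    "<|im_start|>human\nText: Mit Patel here from India<|im_end|>\n"
    "<|im_start|>gpt\nI've read this text.<|im_end|>\n"
    "<|im_start|>human\nwhat is a name of the person in the text?<|im_end|>\n"
    '<|im_start|>gpt:\n["Mit Patel"]<|im_end|>'
)
print(extract_entities(sample))
```

Returning an empty list on malformed output is a design choice for this sketch; you may prefer to raise an error so bad generations are not silently ignored.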


	### Framework versions

	- PEFT 0.7.1
	- Transformers 4.36.2
	- PyTorch 2.1.0+cu121
	- Datasets 2.15.0
	- Tokenizers 0.15.0