README.md · segestic/phi2_medical

metadata

base_model: microsoft/phi-2
datasets:
  - medalpaca/medical_meadow_health_advice
  - medalpaca/medical_meadow_mediqa
  - medalpaca/medical_meadow_mmmlu
  - medalpaca/medical_meadow_medical_flashcards
  - medalpaca/medical_meadow_wikidoc_patient_information
  - medalpaca/medical_meadow_wikidoc
  - medalpaca/medical_meadow_pubmed_causal
  - medalpaca/medical_meadow_medqa
  - medalpaca/medical_meadow_cord19
language:
  - en
license: mit
license_link: https://huggingface.co/microsoft/phi-2/resolve/main/LICENSE
pipeline_tag: text-generation
tags:
  - nlp
  - Medicine

Model Summary

Phi2_med_seg is a fine-tuned version of the Phi-2 model (microsoft/phi-2), optimized for medical applications. It was trained with the Trainer framework on several datasets from the MedAlpaca collection, which focuses on medical question answering and conversational AI. The model can answer questions about specific medical concepts.

How to Get Started with the Model

Sample Code

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load the Phi-2 base model in 8-bit (requires bitsandbytes and a CUDA GPU).
base_model_id = "microsoft/phi-2"
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    trust_remote_code=True,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float16,
)

# Tokenizer used for inference; Phi-2 has no pad token, so reuse EOS.
eval_tokenizer = AutoTokenizer.from_pretrained(base_model_id, add_bos_token=True, trust_remote_code=True, use_fast=False)
eval_tokenizer.pad_token = eval_tokenizer.eos_token

# Attach the fine-tuned adapter weights to the base model.
adapter_model_id = "segestic/phi2_medical_seg"
ft_model = PeftModel.from_pretrained(base_model, adapter_model_id)

eval_prompt = "What is medicine?"
# Move the inputs to wherever device_map="auto" placed the model.
model_input = eval_tokenizer(eval_prompt, return_tensors="pt").to(ft_model.device)

ft_model.eval()
with torch.no_grad():
    output_ids = ft_model.generate(**model_input, max_new_tokens=100, repetition_penalty=1.11)
print(eval_tokenizer.decode(output_ids[0], skip_special_tokens=True))

Training

The model was fine-tuned on the medical datasets listed in the metadata above to improve its ability to understand and generate relevant medical information. The goal is better performance in medical contexts, so that the model can serve healthcare professionals and researchers. Training was run with the Trainer framework.
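This card does not state how dataset records were turned into training text. The sketch below shows one plausible way to do it, assuming each record has `instruction`, `input`, and `output` fields (an assumption about the dataset schema). The template, the `format_example` helper, and the sample record are hypothetical and are not the author's confirmed method.

```python
def format_example(record: dict) -> str:
    """Join one instruction-style record into a single training string.

    Hypothetical template: the field names and layout are assumptions,
    not the format confirmed for this model.
    """
    instruction = record["instruction"].strip()
    context = (record.get("input") or "").strip()
    answer = record["output"].strip()

    parts = [f"Instruction: {instruction}"]
    if context:  # skip the Input line when the record has no extra context
        parts.append(f"Input: {context}")
    parts.append(f"Output: {answer}")
    return "\n".join(parts)


# Made-up record, for illustration only.
sample = {
    "instruction": "Answer the question.",
    "input": "What does BMI stand for?",
    "output": "Body mass index.",
}
print(format_example(sample))
```

Whatever template is used, the same layout should be applied to prompts at inference time, since a fine-tuned model tends to respond best to the format it saw during training.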

Model

Architecture: a Transformer-based model with next-word prediction objective
Context length: 2048 tokens
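
Because prompt and generated tokens share the 2048-token window, it helps to work out how much room a prompt has before calling `generate`. The sketch below shows one way to do this; the helper names `max_prompt_tokens` and `truncate_prompt_ids` are hypothetical, and keeping the tail of an over-long prompt is an assumed policy rather than something this card specifies.

```python
from typing import List

CONTEXT_LENGTH = 2048  # context window stated in this model card


def max_prompt_tokens(max_new_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many prompt tokens fit once room is reserved for generation."""
    budget = context_length - max_new_tokens
    if max_new_tokens < 0 or budget <= 0:
        raise ValueError("max_new_tokens must leave room for a prompt")
    return budget


def truncate_prompt_ids(
    input_ids: List[int],
    max_new_tokens: int,
    context_length: int = CONTEXT_LENGTH,
) -> List[int]:
    """Keep only the most recent tokens so prompt plus generation fits the window."""
    budget = max_prompt_tokens(max_new_tokens, context_length)
    return input_ids[-budget:]


# With max_new_tokens=100, as in the sample code, the prompt budget is 2048 - 100.
print(max_prompt_tokens(100))
```

Note that truncating from the left also drops any leading BOS token, so a caller who relies on it would need to prepend it again.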
