Mistral-7B Medical QA Model

A specialized medical question-answering model built on Mistral-7B and fine-tuned on the FreedomIntelligence/medical-o1-reasoning-SFT dataset.

Model Description

This model is a LoRA adaptation of Mistral-7B, fine-tuned to provide accurate and informative answers to medical questions. It's optimized using Unsloth for efficient training and inference.
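
If the repository hosts the LoRA adapter weights separately from a merged checkpoint, the adapter can in principle also be applied with the standard Hugging Face PEFT library instead of Unsloth; the Unsloth route shown under Usage below is the documented one. A minimal, untested sketch under that assumption:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the 4-bit base model the adapter was trained on
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/mistral-7b-bnb-4bit",
    device_map="auto"
)

# Apply the LoRA adapter (assumes adapter files are present in the repo)
model = PeftModel.from_pretrained(base_model, "Subh775/mistral-7b-medical-o1-ft")
tokenizer = AutoTokenizer.from_pretrained("unsloth/mistral-7b-bnb-4bit")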

Usage

To use this model:

!pip install unsloth
from unsloth import FastLanguageModel
import torch

# Define the Alpaca prompt template
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Input:
{input_text}
### Response:
{output}"""

# Load your model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Subh775/mistral-7b-medical-o1-ft",
    max_seq_length=2048,
    load_in_4bit=True
)

# Enable optimized inference mode for faster generation
FastLanguageModel.for_inference(model)

# Function to handle the chat loop with memory

def chat():
    print("Chat with mistral-7b-medical-o1-ft! Type '\\q' or 'quit' to stop.\n")

    chat_history = ""  # Store the conversation history

    while True:
        # Get user input
        user_input = input("➤ ")

        # Exit condition
        if user_input.lower() in ['\\q', 'quit']:
            print("\nExiting the chat. Goodbye 🩺👍!")
            print("✨" + "=" * 27 + "✨\n")
            break

        # Append the current input to chat history with instruction formatting
        prompt = alpaca_prompt.format(
            instruction="Please answer the following medical question.",
            input_text=user_input,
            output=""
        )
        chat_history += prompt + "\n"
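        # Note: chat_history is never truncated, so long conversations can
        # exceed the model's 2048-token max_seq_length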

        # Tokenize combined history and move to GPU
        inputs = tokenizer([chat_history], return_tensors="pt").to("cuda")

        # Generate output with configured parameters
        outputs = model.generate(
            **inputs,
            max_new_tokens=256,
            temperature=0.7,
            top_p=0.9,
            num_return_sequences=1,
            do_sample=True,
            no_repeat_ngram_size=2
        )

        # Decode and clean the model's response
        decoded_output = tokenizer.batch_decode(outputs, skip_special_tokens=True)
        clean_output = decoded_output[0].split('### Response:')[-1].strip()

        # Append the response so the history remains a valid Alpaca transcript
        chat_history += clean_output + "\n"

        # Display the response
        print(f"\n🧑‍⚕️: {clean_output}\n")

# Start the chat
chat()
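
For a quick single-turn test without the interactive loop, a minimal sketch reusing the model, tokenizer, and alpaca_prompt defined above (the example question is illustrative):

question = "What are the first-line treatments for type 2 diabetes?"

prompt = alpaca_prompt.format(
    instruction="Please answer the following medical question.",
    input_text=question,
    output=""
)

inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, top_p=0.9, do_sample=True)
response = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(response.split('### Response:')[-1].strip())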

Training

This model was fine-tuned on the FreedomIntelligence/medical-o1-reasoning-SFT dataset, which contains approximately 50,000 high-quality medical question-answer pairs with step-by-step reasoning; a 19,704-sample subset was used for this run (see Key Features below). Training used Unsloth for optimization and LoRA for parameter-efficient fine-tuning.
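
Before training, each record has to be rendered into the Alpaca template shown under Usage. A minimal preprocessing sketch, assuming the dataset's published Question, Complex_CoT, and Response fields and its "en" configuration (the exact formatting used for this run is not documented):

from datasets import load_dataset

# "en" config assumed; the dataset also ships a Chinese split
dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train")

EOS_TOKEN = tokenizer.eos_token  # append EOS so the model learns to stop

def format_example(example):
    # Fold the chain-of-thought and final answer into the response slot
    text = alpaca_prompt.format(
        instruction="Please answer the following medical question.",
        input_text=example["Question"],
        output=example["Complex_CoT"] + "\n" + example["Response"]
    ) + EOS_TOKEN
    return {"text": text}

dataset = dataset.map(format_example)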

Key Features

  • Base Model: unsloth/mistral-7b-bnb-4bit
  • Fine-Tuning Objective: Adaptation for structured, step-by-step medical reasoning tasks.
  • Training Dataset: 19,704 samples from the medical-o1-reasoning-SFT dataset.
  • Tools Used:
    • Unsloth: Accelerates training by roughly 2x.
    • 4-bit Quantization: Reduces model memory usage.
    • LoRA Adapters: Enable parameter-efficient fine-tuning (a setup sketch follows this list).
  • Training Time: 38 minutes 57 seconds for 1 epoch.
  • Final Logged Step: 60, with a training loss of 1.1607.
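
The features above correspond to a standard Unsloth LoRA recipe. A minimal sketch of that setup; the rank, target modules, batch size, and learning rate are illustrative assumptions, since the exact hyperparameters are not documented (max_steps=60 matches the final logged step):

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the 4-bit quantized base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True
)

# Attach LoRA adapters; r and target_modules are assumed values
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing=True
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,               # formatted as in the Training section
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,   # assumed
        gradient_accumulation_steps=4,   # assumed
        max_steps=60,                    # matches the final logged step
        learning_rate=2e-4,              # assumed
        fp16=True,
        logging_steps=10,
        output_dir="outputs"
    )
)
trainer.train()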

Limitations

  • This model provides general medical information and should not be used as a substitute for professional medical advice.
  • The model's knowledge is limited to its training data and may not include the latest medical research.
  • Not clinically validated and should not be used for diagnosis or treatment decisions.

License

This model inherits the Apache 2.0 license from the base Mistral-7B model.

Citations

@misc{mistral-7b-medical-o1-ft,
  author = {Subh775},
  title = {Mistral-7B Medical QA Model},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Subh775/mistral-7b-medical-o1-ft}}
}