Part of the Model Checkpoint Formats collection: versatile model save formats including LoRA, GGUF, and merged weights for deployment.
A specialized medical question-answering model built on Mistral-7B and fine-tuned on the FreedomIntelligence/medical-o1-reasoning-SFT dataset.
This model is a LoRA adaptation of Mistral-7B, fine-tuned to provide accurate and informative answers to medical questions. It's optimized using Unsloth for efficient training and inference.
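Besides the Unsloth workflow shown below, the adapter can in principle be loaded with plain transformers + PEFT. The following is a minimal, untested sketch that assumes the repository ships a standard PEFT adapter (adapter_config.json plus adapter weights) and tokenizer files; the Unsloth path remains the recommended one.

import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# Hypothetical alternative load path; assumes a standard PEFT adapter
# sits on top of the 4-bit base model referenced in the adapter config.
model = AutoPeftModelForCausalLM.from_pretrained(
    "Subh775/mistral-7b-medical-o1-ft",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Subh775/mistral-7b-medical-o1-ft")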
To use this model:
!pip install unsloth
from unsloth import FastLanguageModel
import torch
# Define the Alpaca prompt template
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Input:
{input_text}
### Response:
{output}"""
# Load your model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Subh775/mistral-7b-medical-o1-ft",
    max_seq_length=2048,
    load_in_4bit=True
)
# Enable optimized inference mode for faster generation
FastLanguageModel.for_inference(model)
# Function to handle the chat loop with memory
def chat():
    print("Chat with mistral-7b-medical-o1-ft! Type '\\q' or 'quit' to stop.\n")
    chat_history = ""  # Store the conversation history

    while True:
        # Get user input
        user_input = input("➤ ")

        # Exit condition
        if user_input.lower() in ['\\q', 'quit']:
            print("\nExiting the chat. Goodbye 🩺👍!")
            print("✨" + "=" * 27 + "✨\n")
            break

        # Append the current input to the chat history with instruction formatting
        prompt = alpaca_prompt.format(
            instruction="Please answer the following medical question.",
            input_text=user_input,
            output=""
        )
        chat_history += prompt + "\n"

        # Tokenize the combined history and move it to the GPU
        inputs = tokenizer([chat_history], return_tensors="pt").to("cuda")

        # Generate output with the configured sampling parameters
        outputs = model.generate(
            **inputs,
            max_new_tokens=256,
            temperature=0.7,
            top_p=0.9,
            num_return_sequences=1,
            do_sample=True,
            no_repeat_ngram_size=2
        )

        # Decode and keep only the text after the last "### Response:" marker
        decoded_output = tokenizer.batch_decode(outputs, skip_special_tokens=True)
        clean_output = decoded_output[0].split('### Response:')[-1].strip()

        # Add the response to the chat history
        chat_history += f"🧑‍⚕️: {clean_output}\n"

        # Display the response
        print(f"\n🧑‍⚕️: {clean_output}\n")
# Start the chat
chat()
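For a quick one-off query without the interactive loop, the same pieces can be reused directly. The sketch below relies on the model, tokenizer, and alpaca_prompt defined above; the question string is only an illustrative placeholder.

# Single-turn query, reusing model, tokenizer, and alpaca_prompt from the setup above.
question = "What are the common symptoms of iron-deficiency anemia?"  # illustrative example
prompt = alpaca_prompt.format(
    instruction="Please answer the following medical question.",
    input_text=question,
    output=""
)

inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

# Keep only the text generated after the "### Response:" marker
answer = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0].split("### Response:")[-1].strip()
print(answer)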
This model was fine-tuned on the FreedomIntelligence/medical-o1-reasoning-SFT dataset, which contains approximately 50,000 high-quality medical question-answer pairs. The training used Unsloth for optimization and LoRA for parameter-efficient fine-tuning.
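The exact training configuration is not reproduced in this card. The sketch below only illustrates what an Unsloth + LoRA fine-tuning run on this dataset typically looks like; the LoRA rank, learning rate, batch size, dataset configuration name ("en"), and column names ("Question", "Response") are assumptions, not the actual values used. It also reuses the alpaca_prompt template from the usage section.

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Illustrative sketch only: hyperparameters, config name, and column names are assumptions.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small fraction of parameters is trained
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

def to_text(example):
    # "Question" and "Response" are assumed column names in the dataset
    return {"text": alpaca_prompt.format(
        instruction="Please answer the following medical question.",
        input_text=example["Question"],
        output=example["Response"],
    ) + tokenizer.eos_token}

dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train")
dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        num_train_epochs=1,
        output_dir="outputs",
    ),
)
trainer.train()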
This model inherits the license from the base Mistral-7B model.
@misc{mistral-7b-medical-o1-ft,
author = {Subh775},
title = {Mistral-7B Medical QA Model},
year = {2025},
publisher = {HuggingFace},
journal = {HuggingFace Repository},
howpublished = {\url{https://huggingface.co/Subh775/mistral-7b-medical-o1-ft}}
}
Base model: unsloth/mistral-7b-bnb-4bit