Gazal-R1
This is a DoRA adapter fine-tuned on top of Qwen/Qwen3-32B for specialized medical reasoning tasks. It was trained with PEFT using DoRA, a weight-decomposed variant of LoRA, to enhance the base model's ability to perform step-by-step clinical reasoning and medical problem-solving. Fine-tuning used a synthetic, structured reasoning dataset containing medical questions paired with step-by-step reasoning traces and final answers.
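The exact training hyperparameters are not listed in this card. As a minimal sketch, assuming standard PEFT tooling, a DoRA adapter of this kind is created by enabling `use_dora` on a `LoraConfig`; the rank, alpha, and target modules below are illustrative placeholders, not the actual training settings:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-32B", torch_dtype="auto", device_map="auto"
)

# use_dora=True switches PEFT's LoRA layers to DoRA (weight-decomposed LoRA).
# r, lora_alpha, and target_modules are illustrative assumptions, not the
# settings actually used to train this adapter.
dora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    use_dora=True,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, dora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```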
This model is intended for medical education and clinical reasoning training. It should NOT be used for actual medical diagnosis or treatment decisions. Always consult qualified healthcare professionals for medical advice.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
# Model and adapter identifiers
base_model_id = "Qwen/Qwen3-32B"
adapter_id = "TachyHealth/Gazal-R1-32B-sft-merged"
# Load the tokenizer and base model
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype="auto",
    device_map="auto",
)
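
# If GPU memory is tight, the 32B base model can instead be loaded in 4-bit
# via bitsandbytes (assumes the bitsandbytes package is installed):
# from transformers import BitsAndBytesConfig
# model = AutoModelForCausalLM.from_pretrained(
#     base_model_id,
#     quantization_config=BitsAndBytesConfig(load_in_4bit=True),
#     device_map="auto",
# )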
# Load the DoRA adapter
model = PeftModel.from_pretrained(model, adapter_id)
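
# Optionally fold the adapter into the base weights for faster inference.
# merge_and_unload() is a standard PEFT call; it needs enough memory to
# materialize the merged 32B model:
# model = model.merge_and_unload()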
# Prepare a prompt following the format used during training
query = """[MEDICAL QUESTION]"""
messages = [
    {"role": "system", "content": "When solving complex medical problems, follow this specific format..."},
    {"role": "user", "content": query},
]
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
# Generate response
outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=2048,
    temperature=0.6,
    do_sample=True,
)
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
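By default, Qwen3 models emit their own `<think>` reasoning blocks before the answer. If you want to suppress the base model's built-in thinking and rely solely on the adapter's trained reasoning format, recent Qwen3 chat templates accept an `enable_thinking` flag when rendering the prompt; treat this as an assumption to verify against your transformers and template versions:

```python
# Assumption: the Qwen3 chat template supports the enable_thinking kwarg.
# Setting it to False disables the built-in <think> scratchpad.
input_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
```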
Gazal-R1 achieves strong results across standard medical benchmarks (scores are accuracy, %):
| Model | Size | MMLU Pro (Medical) | MedMCQA | MedQA | PubMedQA |
|---|---|---|---|---|---|
| Gazal-R1 (Final) | 32B | 81.6 | 71.9 | 87.1 | 79.6 |
| Gazal-R1 (SFT-only) | 32B | 79.3 | 72.3 | 86.9 | 77.6 |
| Llama 3.1 405B Instruct | 405B | 70.2 | 75.8 | 81.9 | 74.6 |
| Qwen 2.5 72B Instruct | 72B | 72.1 | 66.2 | 72.7 | 71.7 |
| Med42-Llama3.1-70B | 70B | 66.1 | 72.4 | 80.4 | 77.6 |
| Llama 3.1 70B Instruct | 70B | 74.5 | 72.5 | 78.4 | 78.5 |
| QwQ 32B | 32B | 70.1 | 65.6 | 72.3 | 73.7 |
| Qwen 3 32B | 32B | 78.4 | 71.6 | 84.4 | 76.7 |