yakusokulabs/dr_qwen_v2

This model yakusokulabs/dr_qwen_v2 was converted to MLX format from mlx-community/Qwen3-4B-4bit-DWQ-053125 using mlx-lm version 0.25.0.

Yakusoku Labs – Dr Qwen v2

A 4-bit, Apple-MLX-ready medical Qwen-4B you can fine-tune and run on a single modern iPhone.
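The single-device claim is easiest to sanity-check with a back-of-envelope weight-size estimate. A minimal sketch, where the ~0.5 bits/weight overhead for group-quantization scales and the `quantized_size_gb` helper are illustrative assumptions, not measured values:

```python
# Back-of-envelope memory estimate for a 4-bit quantized 4B-parameter model.
# Assumption (illustrative): ~0.5 bits/weight of overhead for group-quantization
# scales and zero-points, on top of the 4 data bits per weight.

def quantized_size_gb(n_params, bits_per_weight=4.0, overhead_bits=0.5):
    """Approximate weight size in gigabytes (decimal GB)."""
    total_bits = n_params * (bits_per_weight + overhead_bits)
    return total_bits / 8 / 1e9

full_bf16 = quantized_size_gb(4e9, bits_per_weight=16.0, overhead_bits=0.0)
quant_4bit = quantized_size_gb(4e9)
print(f"BF16: {full_bf16:.2f} GB, 4-bit: {quant_4bit:.2f} GB")  # 8.00 GB vs 2.25 GB
```

Roughly 2 GB of weights (versus ~8 GB in BF16) is what makes a 4B model viable in the memory budget of a recent iPhone.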


🧩 Model Summary

  • Base: Qwen3-4B
  • Precision: 4-bit NF4 (DWQ)
  • Framework: Apple MLX (mlx-lm 0.25.0)
  • Hardware used: 1 × Mac mini M4 Pro (16-core GPU)
  • Energy / time: ≈ 14 GPU-hours at ~60 W avg → roughly 4× less power than an equivalent PyTorch run
  • License: Apache 2.0 (weights & code)

Dr Qwen v2 is purpose-tuned for clinical Q&A, triage, and medication counseling while staying light enough for edge devices.
The current checkpoint is fine-tuned only on public medical datasets; de-identified Indian tele-medicine dialogues will be merged in once legal clearance is obtained.


🎯 Intended Use & Limitations

Intended

  • Medical trivia & exam datasets (MedMCQA, USMLE-style)
  • Low-risk symptom triage with human oversight
  • Research baseline for Apple-silicon ML pipelines

Out of scope / MUST-NOT

  • Autonomous diagnosis or prescription
  • High-acuity decision support without a licensed clinician in the loop
  • Any use that generates or stores protected health information (PHI) or other personally identifiable health data

πŸ“š Training Data

  Corpus                Size         License
  MedMCQA               354 k QA     CC-BY-NC-SA-4.0
  MedQA-USMLE           13 k QA      MIT
  PubMedQA              1 k          CC0
  MMLU-Medical          1.2 k        MIT
  ChatDoctor Dialogues  100 k turns  Apache 2.0

Planned: +35 k doctor-annotated Indian tele-health Q&A (DPDP-compliant, de-identified).


βš™οΈ Training Procedure

  • 3 epochs, batch size 128, LR 6e-5 with cosine decay, seed 42
  • LoRA rank 64 on the query/key/value/projection matrices
  • Gradient checkpointing and mixed precision; NF4 quantization applied after SFT
  • Direct Preference Optimisation (DPO) on synthetic doctor ratings (3 B tokens)
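The cosine-decay schedule above can be sketched as follows; the `cosine_lr` helper is illustrative, and mlx-lm's own scheduler implementation may differ in details such as warmup:

```python
import math

def cosine_lr(step, total_steps, base_lr=6e-5, min_lr=0.0):
    """Cosine-decay learning rate, matching the run above (base LR 6e-5)."""
    progress = step / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

print(cosine_lr(0, 1000))     # 6e-05 (starts at the base LR)
print(cosine_lr(500, 1000))   # ~3e-05 (halved mid-run)
print(cosine_lr(1000, 1000))  # 0.0 (decayed to the floor)
```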

πŸ“Š Evaluation

  Benchmark (zero-shot)  Base Qwen-4B  Dr Qwen v2  Llama-3.3-8B
  MedMCQA                57.8 %        63.5 %      64.1 %
  PubMedQA               48.6 %        55.2 %      56.0 %
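Zero-shot multiple-choice benchmarks such as MedMCQA are typically scored by exact match on the predicted option letter. A minimal scorer sketch; the `score_mcq` helper and its letter-extraction regex are illustrative, not the harness used for the numbers above:

```python
import re

def score_mcq(predictions, answers):
    """Exact-match accuracy over option letters (A-D), case-insensitive.

    `predictions` are raw model outputs; the first standalone option
    letter found in each output is taken as the model's choice.
    """
    correct = 0
    for pred, gold in zip(predictions, answers):
        m = re.search(r"\b([A-D])\b", pred.upper())
        if m and m.group(1) == gold.upper():
            correct += 1
    return correct / len(answers)

preds = ["The answer is B.", "C", "I think (A) is correct"]
golds = ["B", "C", "D"]
print(score_mcq(preds, golds))  # ~0.667 (2 of 3 correct)
```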

Clinician panel (500 simulated consultations via Yakusoku's multi-agent sandbox):
94 % of answers were tagged "clinically acceptable" – 3 pp shy of the human baseline and 9–12 pp above the baseline models.


πŸ›‘οΈ Safety & Responsible AI

  • All datasets are public or de-identified; no raw PHI ingested.
  • ClinGuard-Lite rule-based filter blocks guideline-violating outputs (e.g., antibiotic over-prescription).
  • Upcoming blinded trials with IRB oversight (Q3 2025).
  • Please keep a licensed clinician in the loop.
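As an illustration only (the actual ClinGuard-Lite rule set is not public), a rule-based output filter of this kind can be as simple as a list of regex rules. The `filter_output` helper and the patterns below are hypothetical:

```python
# Illustrative stand-in for a rule-based output filter like ClinGuard-Lite.
# The real rule set is not public; these patterns are deliberately simplistic.
import re

BLOCK_RULES = [
    (re.compile(r"\b(amoxicillin|azithromycin|ciprofloxacin)\b.*\bwithout (a )?prescription\b", re.I),
     "antibiotic recommendation without prescription"),
    (re.compile(r"\byou (definitely|certainly) have\b", re.I),
     "definitive diagnosis language"),
]

def filter_output(text):
    """Return (allowed, reason); blocks text matching any rule."""
    for pattern, reason in BLOCK_RULES:
        if pattern.search(text):
            return False, reason
    return True, None

print(filter_output("You definitely have pneumonia."))
# (False, 'definitive diagnosis language')
print(filter_output("Please consult a clinician for evaluation."))
# (True, None)
```

A filter like this runs on the model's output string before it is shown to the user; blocked responses would be replaced with a deferral to a clinician.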

πŸš€ Quick Start (Apple MLX)

pip install mlx-lm

from mlx_lm import load, generate

model, tokenizer = load("yakusokulabs/dr_qwen_v2")

prompt = "Patient: I have a mild cough and low-grade fever.\nDoctor:"
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

print(generate(model, tokenizer, prompt=prompt, verbose=True))

Model tree for yakusokulabs/dr_qwen_v2

  • Base model: Qwen/Qwen3-4B-Base
  • Finetuned: Qwen/Qwen3-4B
  • Quantized: this model