Scandinavian Medical GPT-OSS 20B

A fine-tuned version of OpenAI's GPT-OSS 20B model, optimized for natural, fluent medical writing in Swedish, Danish, and Norwegian. It addresses the need for AI systems that can generate natural-sounding, medical journal-style Scandinavian text with correct terminology.

Model Purpose

This fine-tuned model is specifically designed for healthcare professionals and researchers who need:

  • Natural medical report writing in Scandinavian languages
  • Authentic medical terminology usage across Swedish, Danish, and Norwegian
  • Fluent medical documentation that reads like native medical journal content
  • Cross-linguistic medical knowledge transfer between Scandinavian languages

Fine-tuning Strategy

Training Approach

  • Base Model: OpenAI GPT-OSS 20B (unsloth/gpt-oss-20b)
  • Method: LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning
  • Focus: Medical language fluency and natural writing style in Scandinavian languages
  • Training Paradigm: Supervised Fine-Tuning (SFT) with medical conversation format

Architecture Modifications

  • LoRA Rank (r): 16, sized for medical domain adaptation
  • LoRA Alpha: 32 (2x the rank), weighting the adapters more strongly toward the medical fine-tuning data
  • Target Modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
  • Quantization: 4-bit quantization for memory efficiency
  • Trainable Parameters: ~8M out of 20.9B total parameters (0.04%)
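
A minimal sketch of how this adapter configuration maps onto Unsloth's API (the actual training script is not included in this repository; argument names follow the unsloth library):

from unsloth import FastLanguageModel

# Load the 4-bit base model before attaching LoRA adapters.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gpt-oss-20b",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters matching the configuration above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,              # LoRA rank
    lora_alpha=32,     # LoRA scaling factor
    lora_dropout=0.0,
    bias="none",
    use_rslora=False,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)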

Training Data

Dataset composition (1,611 total samples):

  1. Norwegian Colossal Corpus (NCC) - 111 samples

    • Real-world Norwegian text filtered for medical/formal content
    • Natural language patterns from authentic Norwegian sources
    • Focus on formal, academic, and medical terminology
  2. Synthetic Medical Conversations - 1,500 samples

    • Languages: Swedish, Danish, Norwegian (500 samples each)
    • Medical Conditions: Myocardial infarction, post-operative care, diabetes, hypertension, asthma
    • Content Types: Medical assessments, patient reports, treatment recommendations
    • Format: Structured as doctor-patient conversations with proper system prompts

Language-Specific System Prompts:

  • Swedish: "Du är en erfaren läkare som skriver medicinska rapporter och journaler på svenska. Skriv naturligt och flytande med korrekt medicinsk terminologi."
  • Danish: "Du er en erfaren læge, der skriver medicinske rapporter og journaler på dansk. Skriv naturligt og flydende med korrekt medicinsk terminologi."
  • Norwegian: "Du er en erfaren lege som skriver medisinske rapporter og journaler på norsk. Skriv naturlig og flytende med korrekt medisinsk terminologi."

All three translate to: "You are an experienced physician who writes medical reports and records in [language]. Write naturally and fluently with correct medical terminology."
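
To illustrate how these prompts pair with the synthetic conversations, here is a sketch of wrapping one Q/A pair in the chat format used for SFT (the dict keys and helper name are illustrative, not the actual dataset schema):

SYSTEM_PROMPTS = {
    "sv": "Du är en erfaren läkare ...",  # full Swedish prompt as listed above
    "da": "Du er en erfaren læge ...",    # full Danish prompt as listed above
    "no": "Du er en erfaren lege ...",    # full Norwegian prompt as listed above
}

def to_chat_sample(language: str, question: str, answer: str) -> dict:
    # One doctor-patient exchange in the messages format used for SFT.
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPTS[language]},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }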

Training Parameters

Training Configuration:
  Epochs: 2
  Total Steps: 202
  Batch Size: 2 per device
  Gradient Accumulation: 8 steps
  Effective Batch Size: 16
  Learning Rate: 1e-4
  Scheduler: Cosine
  Warmup Steps: 50
  Optimizer: AdamW 8-bit
  Weight Decay: 0.01
  Max Sequence Length: 2048 tokens
  
LoRA Configuration:
  Rank: 16
  Alpha: 32
  Dropout: 0.0
  Bias: none
  Use RSLoRA: false
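
These hyperparameters translate roughly into the following trl trainer setup (a sketch; `dataset` is assumed to be the 1,611-sample chat dataset, and exact SFTConfig fields vary between trl versions):

from trl import SFTConfig, SFTTrainer

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,       # newer trl versions use processing_class instead
    train_dataset=dataset,
    args=SFTConfig(
        num_train_epochs=2,
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,   # effective batch size: 2 x 8 = 16
        learning_rate=1e-4,
        lr_scheduler_type="cosine",
        warmup_steps=50,
        optim="adamw_8bit",
        weight_decay=0.01,
        max_seq_length=2048,
        output_dir="outputs",
    ),
)
trainer.train()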

Installation & Setup

Prerequisites

pip install --upgrade uv
uv pip install \
    "torch>=2.8.0" "triton>=3.4.0" numpy torchvision bitsandbytes "transformers>=4.55.3" \
    "unsloth_zoo[base] @ git+https://github.com/unslothai/unsloth-zoo" \
    "unsloth[base] @ git+https://github.com/unslothai/unsloth"

Hardware Requirements

  • GPU: NVIDIA GPU with at least 16GB VRAM (tested on RTX A6000)
  • RAM: 32GB+ recommended
  • Storage: 50GB+ for model and dependencies

Loading the Model

from unsloth import FastLanguageModel
import torch

# Load the fine-tuned model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="VaibhavSxn/scandinavian-medical-gpt-oss-20b",  # Hugging Face Hub ID or local path
    max_seq_length=2048,
    dtype=None,  # Auto-detection
    load_in_4bit=True,  # Memory efficient loading
)

print("Scandinavian Medical GPT-OSS model loaded successfully!")
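
Optionally, Unsloth's fast inference mode can be enabled before generating (a standard Unsloth call, not specific to this model):

FastLanguageModel.for_inference(model)  # switches the model to Unsloth's optimized inference path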

Inference Examples

Swedish Medical Assessment

from transformers import TextStreamer

messages = [
    {
        "role": "system", 
        "content": "Du är en erfaren läkare som skriver medicinska rapporter och journaler på svenska. Skriv naturligt och flytande med korrekt medicinsk terminologi."
    },
    {
        "role": "user", 
        "content": "Skriv en kort medicinsk bedömning för en patient med diabetes typ 2 som behöver justera sin behandling."
    }
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
    reasoning_effort="medium",  # GPT-OSS reasoning control
).to(model.device)

# Generate response
_ = model.generate(
    **inputs, 
    max_new_tokens=300, 
    streamer=TextStreamer(tokenizer),
    temperature=0.7,
    do_sample=True
)

Norwegian Medical Report

messages = [
    {
        "role": "system", 
        "content": "Du er en erfaren lege som skriver medisinske rapporter og journaler på norsk. Skriv naturlig og flytande med korrekt medicinsk terminologi."
    },
    {
        "role": "user", 
        "content": "Skriv en postoperativ rapport for en pasient som har gjennomgått laparoskopisk kolecystektomi."
    }
]

# Same inference pattern as above

Danish Treatment Recommendation

messages = [
    {
        "role": "system", 
        "content": "Du er en erfaren læge, der skriver medicinske rapporter og journaler på dansk. Skriv naturligt og flydende med korrekt medicinsk terminologi."
    },
    {
        "role": "user", 
        "content": "Skriv en behandlingsanbefaling for en patient med hypertension, der ikke responderer på nuværende terapi."
    }
]

# Same inference pattern as above
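
Since the Norwegian and Danish examples reuse the Swedish inference pattern verbatim, a small helper can wrap it (a convenience sketch; `model` and `tokenizer` come from the loading step above, and the function name is illustrative):

from transformers import TextStreamer

def generate_report(messages, max_new_tokens=300, reasoning_effort="medium"):
    # Apply the GPT-OSS chat template and stream the generated report.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
        reasoning_effort=reasoning_effort,
    ).to(model.device)
    return model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        streamer=TextStreamer(tokenizer),
        temperature=0.7,
        do_sample=True,
    )

generate_report(messages)  # e.g. with the Danish messages above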

GPT-OSS Reasoning Effort Control

One unique feature of GPT-OSS is adjustable reasoning effort:

# Available reasoning levels:
reasoning_effort_options = ["low", "medium", "high"]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
    reasoning_effort="high",  # For complex medical reasoning
).to(model.device)

  • Low: Fast responses, minimal reasoning
  • Medium: Balanced performance and speed (recommended for most medical tasks)
  • High: Maximum reasoning capability for complex cases

Model Performance

Training Metrics

  • Peak GPU Memory: ~19.4GB reserved out of ~47.5GB available (RTX A6000)
  • Training Time: ~10-15 minutes on RTX A6000
  • Memory Efficiency: 30% VRAM reduction with Unsloth optimizations
  • Parameter Efficiency: Only 0.04% of parameters trained via LoRA
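
The memory figures above can be reproduced after training with standard torch CUDA statistics (a sketch):

import torch

props = torch.cuda.get_device_properties(0)
total_gb = props.total_memory / 1024**3                    # ~47.5GB on an RTX A6000
reserved_gb = torch.cuda.max_memory_reserved(0) / 1024**3  # peak reserved, ~19.4GB here
print(f"Total VRAM: {total_gb:.1f} GB, peak reserved: {reserved_gb:.1f} GB")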

Limitations

  • This model is for research and educational purposes only
  • Not intended for direct clinical use without human oversight
  • All medical content generated should be reviewed by qualified healthcare professionals
  • The model may generate plausible-sounding but incorrect medical information
  • Always verify medical facts and recommendations through proper medical sources

Citation

If you use this model in your research, please cite:

@misc{scandinavian-medical-gpt-oss-2025,
  title={Scandinavian Medical GPT-OSS: A Fine-tuned Model for Natural Medical Language Generation},
  author={Vaibhav Saxena},
  year={2025},
  howpublished={Fine-tuned from OpenAI GPT-OSS 20B using Unsloth},
}

Contributing

Contributions to improve the model are welcome! Please consider:

  • Adding more diverse medical training data
  • Expanding to additional Scandinavian languages (Icelandic, Faroese)
  • Improving evaluation metrics for medical accuracy
  • Creating domain-specific benchmarks

License

Apache 2.0, go bonkers!

