Llama-3.1-8B-Health-21Level-Complexity

A fine-tuned 8B-parameter Llama 3.1 model that generates medical answers at precise complexity levels, using control codes ranging from <COMPLEXITY_0> (very simple) to <COMPLEXITY_100> (highly technical).


🧠 Overview

  • Foundation model: meta-llama/Meta-Llama-3.1-8B-Instruct
  • Checkpoint used for training: unsloth/meta-llama-3.1-8b-instruct-bnb-4bit (4-bit, LoRA-ready)
  • Architecture: Llama-3.1 + LoRA adapter with control tokens
  • Input: Medical question, optionally prefixed with a <COMPLEXITY_XX> token
  • Output: A tailored medical answer adapted to the desired complexity
  • Complexity Control: 21 levels (0, 5, 10, ..., 100)

🎯 Use Cases

  • Patient education – Adjust responses for different health literacy levels
  • Medical training – Tailor explanations for students, nurses, or professionals
  • Conversational agents – Dynamically adapt to user needs in chatbots
  • Health content creation – Generate multiple versions of the same answer for varied audiences

βš™οΈ Quick Start

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig 
from peft import PeftModel
import torch

# 1. Load tokenizer that contains the extra control tokens
tokenizer = AutoTokenizer.from_pretrained("DNivalis/Llama-3.1-8B-Health-21Level-Complexity")

# 2. Load base model (4-bit, ~7 GB VRAM)
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
base = AutoModelForCausalLM.from_pretrained(
    "unsloth/meta-llama-3.1-8b-instruct-bnb-4bit",
    quantization_config=bnb,
    device_map="auto"
)

# 3. Resize the model's token embeddings to cover the control tokens added to the tokenizer
base.resize_token_embeddings(len(tokenizer))

# 4. Load the LoRA adapter
model = PeftModel.from_pretrained(base, "DNivalis/Llama-3.1-8B-Health-21Level-Complexity")

# 5. Helper: prepend the control token (if given) and generate an answer
def ask(question, level=None):
    # Use "is not None" so that level=0 (the simplest setting) is not dropped
    prompt = f"<COMPLEXITY_{level}> {question}" if level is not None else question
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7, pad_token_id=tokenizer.eos_token_id)
    answer = tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
    return answer.strip()

# 6. Go!
print(ask("What is asthma?", level=10))
print(ask("What is asthma?", level=50))
print(ask("What is asthma?", level=90))
print(ask("What is asthma?"))  # No complexity control
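
The 21 control codes were trained only at multiples of 5. If your application produces arbitrary 0-100 values, a small helper (not part of the model, just a convenience) can snap them to the nearest trained level before calling ask():

# Snap an arbitrary 0-100 value to the nearest trained control level (multiples of 5)
def nearest_level(value):
    value = max(0, min(100, value))
    return 5 * round(value / 5)

print(ask("What is asthma?", level=nearest_level(37)))  # uses <COMPLEXITY_35>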

📊 Complexity Control

  • Trained on 184,843 question-answer pairs rewritten at 21 levels of complexity
  • Levels derived from a data-driven scoring formula based on 13 linguistic features
  • Control codes: <COMPLEXITY_0> through <COMPLEXITY_100>, every 5 points (see the sketch after this list)
  • The scoring formula incorporates:
    • Traditional readability metrics
    • Medical jargon metrics
    • Syntax and cohesion
    • Expert evaluation from multiple LLMs
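
The released tokenizer already contains these control tokens, so no extra setup is needed at inference time. Purely as an illustration (this is not the original training code, and whether the tokens were registered as additional special tokens is an assumption), the full set can be enumerated and added to a fresh tokenizer like this:

# Illustrative sketch: enumerate and register the 21 control tokens
# (assumption: they are added as additional special tokens)
control_tokens = [f"<COMPLEXITY_{i}>" for i in range(0, 101, 5)]  # 0, 5, ..., 100
tokenizer.add_special_tokens({"additional_special_tokens": control_tokens})
base.resize_token_embeddings(len(tokenizer))  # add embedding rows for the new tokens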

📚 Training Data

  • Multi-source QA pairs drawn from:
    • LiveQA, MedicationQA, MediQA-AnS
    • MedQuAD, BioASQ Task 13B
  • Each question paired with 5 rewritten answers for different education levels
  • All variants scored and categorized into 21 distinct complexity levels (one possible record layout is sketched below)
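
As an illustration only (the field names and layout below are assumptions, not the published dataset schema), one question and its scored variants could be turned into control-coded training rows like this:

# Hypothetical sketch of building control-coded chat rows from one question and
# its rewritten variants; field names are assumptions, not the dataset schema
def build_rows(question, variants):
    # variants: list of (answer_text, level) with level in {0, 5, ..., 100}
    return [
        {
            "messages": [
                {"role": "user", "content": f"<COMPLEXITY_{level}> {question}"},
                {"role": "assistant", "content": answer},
            ]
        }
        for answer, level in variants
    ]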


🔧 Fine-Tuning Details

  • Base model: meta-llama/Meta-Llama-3.1-8B-Instruct
  • Training checkpoint: unsloth/meta-llama-3.1-8b-instruct-bnb-4bit
  • PEFT method: LoRA (rank=8, alpha=16, lr=5e-5)
  • Control method: Learned <COMPLEXITY_XX> tokens with semantically initialized embeddings (see the sketch below)
  • Batching strategy: Context-aware (answers to the same question grouped)
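
A minimal sketch of the reported setup; the target modules, the use of modules_to_save, and the exact semantic-initialization recipe are assumptions made for illustration, not the released training script:

from peft import LoraConfig, get_peft_model
import torch

# One plausible reading of "initialized semantically": set each control token's
# embedding to the mean embedding of a short description of that level
with torch.no_grad():
    emb = base.get_input_embeddings().weight
    tok_id = tokenizer.convert_tokens_to_ids("<COMPLEXITY_0>")
    desc_ids = tokenizer("very simple, plain-language explanation", add_special_tokens=False).input_ids
    emb[tok_id] = emb[desc_ids].mean(dim=0)

# Reported hyperparameters: rank=8, alpha=16, lr=5e-5
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    modules_to_save=["embed_tokens", "lm_head"],  # so the new control-token embeddings are trainable (assumed)
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)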

📖 Citation

If you use this model or the associated datasets, please cite:

...


πŸ“ License & Usage

Licensed under Apache 2.0.

  • ✅ Permitted: research, commercial use, redistribution, derivative works
  • 🔗 Include the license notice and attribution when redistributing

⚠️ Notes

  • Model does not replace medical professionals
  • Generated content is for educational or assistive use
  • Some output drift may occur at very high complexity levels
  • Consider content verification for deployment in critical systems

📬 Contact

For issues, feedback, or collaboration requests, open an issue on the model repository.
