Llama-3.1-8B-Instruct LoRA Adapter for Medical MCQ JSON Generation
This is a LoRA (Low-Rank Adaptation) adapter for the meta-llama/Llama-3.1-8B-Instruct model, fine-tuned for medical multiple-choice question answering with structured JSON output generation.
Model Details
Base Model
- Model: meta-llama/Llama-3.1-8B-Instruct
- Architecture: Llama 3.1 instruction-tuned language model
- Parameters: 8B parameters
- Precision: BFloat16
LoRA Configuration
- Rank (r): 64
- Alpha: 128
- Dropout: 0.1
- Target Modules: q_proj, up_proj, gate_proj, v_proj, o_proj, k_proj, down_proj
- Task Type: Causal Language Modeling
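For reference, the configuration above maps onto peft's LoraConfig roughly as follows. This is a sketch, not the original training script, which is not published with this card:

from peft import LoraConfig

lora_config = LoraConfig(
    r=64,                      # LoRA rank
    lora_alpha=128,            # scaling factor
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)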
Training Details
Dataset
- Source: asanchez75/medical_textbooks_mcmq
- Domain: Medical multiple-choice questions
- Language: Primarily French medical content
- Format: JSON-structured input/output pairs
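The dataset can be loaded from the Hugging Face Hub with the datasets library. The split name below is an assumption; check the dataset card for the actual splits and field names:

from datasets import load_dataset

# Split name ("train") is an assumption; see the dataset card for details
dataset = load_dataset("asanchez75/medical_textbooks_mcmq", split="train")
print(dataset[0])  # inspect one JSON-structured input/output pair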
Training Configuration
- Epochs: 3
- Learning Rate: 2e-5
- Batch Size: 4 (per device)
- Gradient Accumulation: 4 steps
- Effective Batch Size: 16
- Sequence Length: 2048 tokens
- Hardware: NVIDIA A100 SXM4 40GB
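As a rough sketch (the original training script is not included here), these hyperparameters correspond to transformers TrainingArguments along these lines; output_dir is a placeholder, and the 2048-token sequence length would be enforced at tokenization time:

from transformers import TrainingArguments

# Sketch only: output_dir and any logging/saving settings are placeholders,
# not the values used to produce the released adapter.
training_args = TrainingArguments(
    output_dir="./llama31-mcq-lora",
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # effective batch size 4 x 4 = 16
    bf16=True,
)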
Performance
- Final Test Loss: 0.493
- Training Time: ~19 minutes
- Memory Usage: 23.4GB peak (A100 40GB)
Usage
Loading the Adapter
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "YOUR_HF_USERNAME/REPO_NAME")
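Optionally, the LoRA weights can be merged into the base model for standalone deployment. This is a general peft feature rather than a required step; the output directory below is a placeholder:

# Merge the adapter into the base weights (optional); the merged model
# can then be saved and served without peft.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("./llama31-mcq-merged")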
Inference Example
import json

# Input format
input_data = {
    "question_id": 12345,
    "question": "Quel est le traitement de première intention de l'hypertension artérielle?",
    "option_a": "Inhibiteurs de l'ECA",
    "option_b": "Bêta-bloquants",
    "option_c": "Diurétiques thiazidiques",
    "option_d": "Antagonistes calciques",
    "option_e": "Sartans"
}

# Format prompt
system_prompt = "You are a helpful medical assistant. Given a multiple-choice question in JSON format, provide the correct answer options and a detailed explanation in JSON format."
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": json.dumps(input_data, ensure_ascii=False)}
]

# Generate response (greedy decoding; generate does not accept temperature=0.0)
formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
Expected Output Format
{
  "correct_options": "A, C",
  "explanation": "Les inhibiteurs de l'ECA et les diurétiques thiazidiques sont recommandés en première intention..."
}
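Because decoding the full output sequence echoes the prompt, one way to recover just the generated JSON is to slice off the prompt tokens and parse the remainder. This is a sketch that continues from the inference example above and assumes the model returns the fields shown in the expected output:

# Decode only the newly generated tokens, then parse the JSON payload
generated = outputs[0][inputs["input_ids"].shape[1]:]
answer_text = tokenizer.decode(generated, skip_special_tokens=True)

try:
    answer = json.loads(answer_text)
    print(answer["correct_options"], answer["explanation"])
except json.JSONDecodeError:
    print("Model did not return valid JSON:", answer_text)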
Model Architecture
This adapter targets the following modules in the Llama-3.1-8B-Instruct model:
- Query projection (q_proj)
- Key projection (k_proj)
- Value projection (v_proj)
- Output projection (o_proj)
- Gate projection (gate_proj)
- Up projection (up_proj)
- Down projection (down_proj)
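Once the adapter is loaded, peft can report how many parameters the LoRA modules add on top of the frozen base model:

# Prints trainable (LoRA) vs. total parameters of the wrapped model
model.print_trainable_parameters()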
Limitations and Biases
- Domain Specific: Optimized for French medical content
- MCQ Format: Designed for structured multiple-choice questions
- Medical Focus: Performance may vary on non-medical content
- Language: Primarily trained on French medical terminology
- Base Model: Built on instruction-tuned Llama 3.1 architecture
Citation
If you use this model, please cite:
@misc{llama31-8b-mcq-lora,
  title={Llama-3.1-8B-Instruct LoRA Adapter for Medical MCQ JSON Generation},
  author={Your Name},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/YOUR_USERNAME/REPO_NAME}
}
License
This adapter is released under the Apache-2.0 license. Note that the base Llama-3.1-8B-Instruct model is distributed under the Llama 3.1 Community License, which also governs use of the adapter together with the base model.