---
library_name: transformers
license: mit
datasets:
- aai530-group6/ddxplus
language:
- en
metrics:
- precision
- recall
- f1
base_model:
- cambridgeltl/SapBERT-from-PubMedBERT-fulltext
tags:
- medical-diagnosis
- sapbert
- ddxplus
- pubmedbert
- disease-classification
- differential-diagnosis
---
## Model Details
### Model Description
This model is a fine-tuned version of cambridgeltl/SapBERT-from-PubMedBERT-fulltext on the DDXPlus dataset (10,000 samples) for medical diagnosis tasks.
- **Developed by:** [Aashish Acharya](https://github.com/acharya-jyu)
- **Model type:** SapBERT (PubMedBERT-based encoder) with a classification head
- **Language(s):** English
- **License:** MIT
- **Finetuned from model:** cambridgeltl/SapBERT-from-PubMedBERT-fulltext
### Model Sources
- **Base model:** [cambridgeltl/SapBERT-from-PubMedBERT-fulltext](https://huggingface.co/cambridgeltl/SapBERT-from-PubMedBERT-fulltext)
- **Dataset:** [aai530-group6/ddxplus](https://huggingface.co/aai530-group6/ddxplus)
## Training Dataset
The model was trained on a 10,000-sample subset of the DDXPlus dataset, which contains:
- Patient cases with comprehensive medical information
- Differential diagnosis annotations
- 49 distinct medical conditions
- Evidence-based symptom-condition relationships
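
The dataset is hosted on the Hub and can be loaded directly with the `datasets` library. A minimal sketch; the split and field names should be verified against the dataset card:

```python
from datasets import load_dataset

# Load the DDXPlus dataset from the Hub (split names and feature layout
# follow the dataset card and may differ from the original release).
ds = load_dataset("aai530-group6/ddxplus")
print(ds)              # available splits and their sizes
print(ds["train"][0])  # one patient case: age, sex, evidences, diagnosis
```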
## Performance
### Final Metrics
- Test Precision: 0.9619
- Test Recall: 0.9610
- Test F1 Score: 0.9592
### Training Evolution
- Best Validation F1: 0.9728 (Epoch 4)
- Final Validation Loss: 0.6352
<img src="https://cdn-uploads.huggingface.co/production/uploads/662757230601587f0be9781b/7GK4e9jy4vKz9gSXU-dbh.png" width="400" alt="image">
<img src="https://cdn-uploads.huggingface.co/production/uploads/662757230601587f0be9781b/5b_O5oX0BISljP1kwdtTN.png" width="400" alt="image">
## Intended Use
This model is designed for:
- Medical diagnosis support
- Symptom analysis
- Disease classification
- Differential diagnosis generation
## Out-of-Scope Use
The model should NOT be used for:
- Direct medical diagnosis without professional oversight
- Critical healthcare decisions without human validation
- Clinical applications without proper testing and validation
## Training Details
### Training Procedure
- Optimizer: AdamW with weight decay (0.01)
- Learning Rate: 1e-5
- Loss Function: Combined loss (0.8 × Focal Loss + 0.2 × KL Divergence); see the sketch after this list
- Batch Size: 32
- Gradient Clipping: 1.0
- Early Stopping: Patience of 3 epochs
- Training Strategy: Cross-validation with 5 folds
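
The combined loss referenced above is the most distinctive training detail. Below is a minimal sketch, assuming a standard focal-loss formulation (the focusing parameter `gamma` is not stated in this card) and that the KL term is computed against a soft target distribution such as the annotated differential diagnosis:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    # Standard focal loss: down-weights well-classified examples.
    # gamma=2.0 is a common default; the card does not state the value used.
    ce = F.cross_entropy(logits, targets, reduction="none")
    pt = torch.exp(-ce)  # model probability assigned to the true class
    return ((1.0 - pt) ** gamma * ce).mean()

def combined_loss(logits, targets, soft_targets):
    # 0.8 * Focal Loss + 0.2 * KL Divergence, per the card.
    # soft_targets is assumed to be a probability distribution over the
    # 49 conditions (e.g. the differential-diagnosis annotation).
    kl = F.kl_div(F.log_softmax(logits, dim=-1), soft_targets,
                  reduction="batchmean")
    return 0.8 * focal_loss(logits, targets) + 0.2 * kl

# Quick check with dummy tensors: batch of 4 cases, 49 conditions.
logits = torch.randn(4, 49)
targets = torch.randint(0, 49, (4,))
soft_targets = torch.softmax(torch.randn(4, 49), dim=-1)
print(combined_loss(logits, targets, soft_targets))
```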
### Model Architecture
- Base Model: cambridgeltl/SapBERT-from-PubMedBERT-fulltext
- Hidden Size: 768
- Attention Heads: 12
- Dropout Rate: 0.5
- Added classification layers for diagnostic tasks (see the sketch after this list)
- Layer normalization and dropout for regularization
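
The exact head is not published with this card, but a minimal sketch consistent with the listed components (layer normalization, 0.5 dropout, and a linear classifier over the 49 DDXPlus conditions) might look like this; the released checkpoint may organize its layers differently:

```python
import torch.nn as nn
from transformers import AutoModel

class DiagnosisClassifier(nn.Module):
    """Illustrative head only: SapBERT encoder + layer norm, dropout,
    and a linear layer over the 49 DDXPlus conditions."""
    def __init__(self, num_conditions=49, hidden_size=768, dropout=0.5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(
            "cambridgeltl/SapBERT-from-PubMedBERT-fulltext")
        self.norm = nn.LayerNorm(hidden_size)     # regularization, per the card
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(hidden_size, num_conditions)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        cls = hidden[:, 0]                        # [CLS] token representation
        return self.classifier(self.dropout(self.norm(cls)))
```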
## Example Usage
```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load model and tokenizer
model_name = "acharya-jyu/sapbert-pubmedbert-ddxplus-10k"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Example patient case. Evidence codes (E_XX) are the standardized
# symptom identifiers from release_evidences.json.
input_data = {
    'age': 45,                   # Patient age
    'sex': 'M',                  # Patient sex: 'M' or 'F'
    'initial_evidence': 'E_91',  # Initial evidence code (e.g., E_91 for fever)
    'evidences': ['E_91', 'E_77', 'E_89'],  # Fever, cough, fatigue
}

# The transformer consumes token IDs, not a raw dict, so serialize the
# case into one text sequence first (the exact serialization used during
# training may differ; check the training code).
text = (
    f"age {input_data['age']} sex {input_data['sex']} "
    f"initial evidence {input_data['initial_evidence']} "
    f"evidences {' '.join(input_data['evidences'])}"
)

inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

# outputs.last_hidden_state feeds the fine-tuned classification head,
# which produces the main diagnosis prediction, differential-diagnosis
# probabilities, and confidence scores.
```
**Note:** Evidence codes (`E_XX`) correspond to specific symptoms and conditions defined in the `release_evidences.json` file. The model expects these standardized codes rather than raw text input.
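
For illustration, here is a minimal sketch of mapping evidence codes to human-readable descriptions, assuming `release_evidences.json` follows the original DDXPlus release format (a JSON object keyed by evidence code with a `question_en` field; verify against your copy of the file):

```python
import json

# Load the DDXPlus evidence definitions (path and schema assumed from
# the original DDXPlus release; adjust to your local copy).
with open("release_evidences.json") as f:
    evidences = json.load(f)

# Translate the example codes into their English questions.
for code in ["E_91", "E_77", "E_89"]:
    meta = evidences.get(code, {})
    print(code, "->", meta.get("question_en", "<unknown code>"))
```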
## Citation
```bibtex
@misc{acharya2024sapbert,
  title={SapBERT-PubMedBERT Fine-tuned on DDXPlus Dataset},
  author={Acharya, Aashish},
  year={2024},
  publisher={Hugging Face Model Hub}
}
```
## Model Card Contact
[Aashish Acharya](https://github.com/acharya-jyu)