Sybil - Lung Cancer Risk Prediction

🎯 Model Description

Sybil is a validated deep learning model that predicts future lung cancer risk from a single low-dose chest CT (LDCT) scan. The model was published in the Journal of Clinical Oncology and outputs one risk score for each of the six years following the scan.

Key Features

  • Single Scan Analysis: Requires only one LDCT scan
  • Multi-Year Prediction: Provides risk scores for years 1-6
  • Validated Performance: Evaluated on data from multiple institutions (NLST, MGH, and CGMH in Taiwan)
  • Ensemble Approach: Uses 5 models for robust predictions
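
The ensemble idea can be illustrated with a minimal sketch. It assumes each member produces one risk score per year and that the members are combined by a plain average; this card does not state the combination rule, so `average_ensemble` and the numbers below are hypothetical.

```python
from statistics import fmean

def average_ensemble(member_scores):
    """Average per-year risk scores across ensemble members.

    member_scores: list of equal-length lists, one per model (hypothetical layout).
    """
    if not member_scores:
        raise ValueError("need at least one set of scores")
    n_years = len(member_scores[0])
    if any(len(scores) != n_years for scores in member_scores):
        raise ValueError("all members must score the same number of years")
    # zip(*...) groups the scores year by year across members
    return [fmean(year_scores) for year_scores in zip(*member_scores)]

# Made-up scores for two years from three members, for illustration only
example = average_ensemble([[0.02, 0.04], [0.03, 0.05], [0.01, 0.06]])
print([round(x, 3) for x in example])
```

Averaging smooths out the idiosyncrasies of any single member, which is the usual motivation for an ensemble.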

🚀 Quick Start

Installation

pip install huggingface-hub torch torchvision pydicom

Basic Usage

from huggingface_hub import snapshot_download
import sys

# Download model
model_path = snapshot_download(repo_id="Lab-Rasool/sybil")
sys.path.append(model_path)

# Import model
from modeling_sybil_wrapper import SybilHFWrapper
from configuration_sybil import SybilConfig

# Initialize
config = SybilConfig()
model = SybilHFWrapper(config)

# Prepare your DICOM files (CT scan slices)
dicom_paths = ["scan1.dcm", "scan2.dcm"]  # Replace with the paths to all slices of one scan

# Get predictions
output = model(dicom_paths=dicom_paths)
risk_scores = output.risk_scores.numpy()

# Display results
print("Lung Cancer Risk Predictions:")
for i, score in enumerate(risk_scores):
    print(f"Year {i+1}: {score*100:.1f}%")
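
Listing every slice by hand is impractical for a full series. A minimal sketch for collecting the paths from one folder follows; it assumes each series lives in its own directory and uses the `.dcm` extension, and `collect_dicom_paths` is a hypothetical helper, not part of the Sybil wrapper.

```python
from pathlib import Path

def collect_dicom_paths(series_dir):
    """Return sorted paths of all .dcm files directly inside series_dir."""
    folder = Path(series_dir)
    if not folder.is_dir():
        raise NotADirectoryError(f"{series_dir} is not a directory")
    # Match the extension case-insensitively; sort for a stable order
    paths = sorted(str(p) for p in folder.iterdir()
                   if p.is_file() and p.suffix.lower() == ".dcm")
    if not paths:
        raise FileNotFoundError(f"no .dcm files found in {series_dir}")
    return paths
```

The returned list can be passed as `dicom_paths` in the call above. Filename order is not guaranteed to match anatomical order, so check it against the input requirements.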

📊 Example with Demo Data

import requests
import zipfile
from io import BytesIO
import os

# Download demo DICOM files
def get_demo_data():
    cache_dir = os.path.expanduser("~/.sybil_demo")
    demo_dir = os.path.join(cache_dir, "sybil_demo_data")

    if not os.path.exists(demo_dir):
        print("Downloading demo data...")
        url = "https://www.dropbox.com/scl/fi/covbvo6f547kak4em3cjd/sybil_example.zip?rlkey=7a13nhlc9uwga9x7pmtk1cf1c&dl=1"
        response = requests.get(url, timeout=60)
        response.raise_for_status()  # Stop here if the download failed

        os.makedirs(cache_dir, exist_ok=True)
        with zipfile.ZipFile(BytesIO(response.content)) as zf:
            zf.extractall(cache_dir)

    # Find DICOM files
    dicom_files = []
    for root, dirs, files in os.walk(cache_dir):
        for file in files:
            if file.endswith('.dcm'):
                dicom_files.append(os.path.join(root, file))

    return sorted(dicom_files)

# Run demo
from huggingface_hub import snapshot_download
import sys

# Load model
model_path = snapshot_download(repo_id="Lab-Rasool/sybil")
sys.path.append(model_path)

from modeling_sybil_wrapper import SybilHFWrapper
from configuration_sybil import SybilConfig

# Initialize and predict
config = SybilConfig()
model = SybilHFWrapper(config)

dicom_files = get_demo_data()
output = model(dicom_paths=dicom_files)

# Show results
for i, score in enumerate(output.risk_scores.numpy()):
    print(f"Year {i+1}: {score*100:.1f}% risk")

Expected output for demo data:

Year 1: 2.2% risk
Year 2: 4.5% risk
Year 3: 7.2% risk
Year 4: 7.9% risk
Year 5: 9.6% risk
Year 6: 13.6% risk

📈 Performance Metrics

Dataset        1-Year AUC   6-Year AUC   Sample Size
NLST Test      0.94         0.86         ~15,000
MGH            0.86         0.75         ~12,000
CGMH Taiwan    0.94         0.80         ~8,000
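
For readers unfamiliar with the metric, AUC is the probability that a randomly chosen patient who developed cancer received a higher score than a randomly chosen patient who did not. A minimal sketch of that definition on made-up data follows (not the study's data; `pairwise_auc` is a hypothetical helper).

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. labels are 1 (cancer) or 0 (no cancer).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: 3 of the 4 positive/negative pairs are ordered correctly
print(pairwise_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking.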

🏥 Intended Use

Primary Use Cases

  • Risk stratification in lung cancer screening programs
  • Research on lung cancer prediction models
  • Clinical decision support (with appropriate oversight)

Users

  • Healthcare providers
  • Medical researchers
  • Screening program coordinators

Out of Scope

  • ❌ Diagnosis of existing cancer
  • ❌ Use with non-LDCT imaging (X-rays, MRI)
  • ❌ Sole basis for clinical decisions
  • ❌ Use outside medical supervision

📋 Input Requirements

  • Format: DICOM files from chest CT scan
  • Type: Low-dose CT (LDCT)
  • Orientation: Axial view
  • Order: Anatomically ordered (abdomen → clavicles)
  • Number of slices: Typically 100-300 slices
  • Resolution: Automatically handled by model
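
The ordering requirement can be checked before inference. The sketch below is one way to do it, assuming axial slices whose `ImagePositionPatient` z-coordinate increases toward the head (the DICOM patient coordinate convention), so that ascending z runs from abdomen to clavicles. `read_slice_positions` and `order_inferior_to_superior` are hypothetical helpers; whether the wrapper re-sorts slices itself is not stated here.

```python
def read_slice_positions(dicom_paths):
    """Read (path, z) pairs from DICOM headers; requires pydicom."""
    import pydicom  # imported lazily so the sorting helper works without it

    positions = []
    for path in dicom_paths:
        # Skip pixel data: only header fields are needed here
        ds = pydicom.dcmread(path, stop_before_pixels=True)
        if ds.Modality != "CT":
            raise ValueError(f"{path} is not a CT image (Modality={ds.Modality})")
        positions.append((path, float(ds.ImagePositionPatient[2])))
    return positions

def order_inferior_to_superior(positions):
    """Sort (path, z) pairs by ascending z and return the paths."""
    return [path for path, _z in sorted(positions, key=lambda item: item[1])]

# Hand-written z values standing in for real headers
demo = [("c.dcm", 12.5), ("a.dcm", -30.0), ("b.dcm", 0.0)]
print(order_inferior_to_superior(demo))  # ['a.dcm', 'b.dcm', 'c.dcm']
```

Sorting on the header position rather than the filename avoids relying on how the scanner or export tool named the files.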

⚠️ Important Considerations

Medical AI Notice

This model should supplement, not replace, clinical judgment. Always consider:

  • Complete patient medical history
  • Additional risk factors (smoking, family history)
  • Current clinical guidelines
  • Need for professional medical oversight

Limitations

  • Optimized for screening population (ages 55-80)
  • Best performance with LDCT scans
  • Not validated for pediatric use
  • Performance may vary with different scanner manufacturers

📚 Citation

If you use this model, please cite the original paper:

@article{mikhael2023sybil,
  title={Sybil: a validated deep learning model to predict future lung cancer risk from a single low-dose chest computed tomography},
  author={Mikhael, Peter G and Wohlwend, Jeremy and Yala, Adam and others},
  journal={Journal of Clinical Oncology},
  volume={41},
  number={12},
  pages={2191--2200},
  year={2023},
  publisher={American Society of Clinical Oncology}
}

🙏 Acknowledgments

This Hugging Face implementation is based on the original work by:

  • Original Authors: Peter G. Mikhael & Jeremy Wohlwend
  • Institutions: MIT CSAIL & Massachusetts General Hospital
  • Original Repository: GitHub
  • Paper: Journal of Clinical Oncology

📄 License

MIT License - See LICENSE file

  • Original Model © 2022 Peter Mikhael & Jeremy Wohlwend
  • HF Adaptation © 2024 Lab-Rasool

🔧 Troubleshooting

Common Issues

  1. Import Error: Append the model path to sys.path before importing the wrapper modules

    sys.path.append(model_path)
    
  2. Missing Dependencies: Install all requirements

    pip install torch torchvision pydicom sybil huggingface-hub
    
  3. DICOM Loading Error: Ensure DICOM files are valid CT scans

    import pydicom
    dcm = pydicom.dcmread("your_file.dcm")  # Test single file
    
  4. Memory Issues: The model requires ~4GB of GPU memory; check whether a GPU is available and fall back to CPU otherwise

    import torch
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    

📬 Support

  • HF Model Issues: Open issue on this repository
  • Original Model: GitHub Issues
  • Medical Questions: Consult healthcare professionals


Note: This is a research model. Always consult qualified healthcare professionals for medical decisions.
