title: RadExtract
emoji: ποΈ
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
license: apache-2.0
header: mini
app_port: 7870
tags:
- medical
- nlp
- radiology
- langextract
- gemini
- structured-data
RadExtract: Radiology Report Structuring Demo
A demonstration application powered by LangExtract that structures radiology reports using Gemini models. Transform unstructured radiology text into organized, interactive segments with clinical significance annotations.
Try the Demo
Transform unstructured radiology reports into structured data with highlighted findings that are precisely mapped back to the original source text.
Key Features
- Structured Output: Organizes reports into anatomical sections with clinical significance
- Interactive Highlighting: Click any finding to see its exact source in the original text
- Clinical Significance: Annotates findings as minor, significant, or grounding
- Character-Level Mapping: Precise attribution back to source text
- Multi-Model Support: Gemini 2.5 Flash (fast) and Pro (comprehensive)
Quick Start
Setup
git clone https://huggingface.co/spaces/google/radextract
cd radextract
python -m venv venv
source venv/bin/activate
pip install -e ".[dev]"
cp env.list.example env.list
# Edit env.list and set KEY=your_gemini_api_key_here
Local Development
source venv/bin/activate
export KEY=your_gemini_api_key_here
python app.py
Access at: http://localhost:7870
API Usage
Example Request
curl -X POST \
-H 'X-Model-ID: gemini-2.5-flash' \
-H 'X-Use-Cache: true' \
-d 'FINDINGS: Normal heart and lungs. IMPRESSION: Normal study.' \
http://localhost:7870/predict
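The same request can be built from Python. The sketch below uses only the standard library and mirrors the curl example above. The helper name `build_predict_request` and its defaults are illustrative assumptions, not part of RadExtract. The request is constructed but not sent, so the snippet runs without a server.

```python
import urllib.request


def build_predict_request(
    report_text,
    base_url="http://localhost:7870",  # assumed local address from this README
    model_id="gemini-2.5-flash",
    use_cache=True,
):
    """Build (but do not send) a POST request mirroring the curl example."""
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=report_text.encode("utf-8"),  # raw report text as the request body
        headers={
            "X-Model-ID": model_id,
            "X-Use-Cache": "true" if use_cache else "false",
        },
        method="POST",
    )


req = build_predict_request(
    "FINDINGS: Normal heart and lungs. IMPRESSION: Normal study."
)

# To send it against a running instance (requires the app and a valid KEY):
# import json
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```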
Response Format
{
  "segments": [{
    "type": "body",
    "label": "Chest",
    "content": "Normal heart and lungs",
    "intervals": [{"startPos": 10, "endPos": 32}],
    "significance": "minor"
  }],
  "text": "Chest:\n- Normal heart and lungs",
  "annotated_document_json": {...}
}
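The `intervals` field is what ties each segment back to the source text. The sketch below shows one way a client might recover the quoted source span for every segment. It assumes that `startPos` and `endPos` are character offsets into the submitted text and that `endPos` is exclusive, which matches the example above but is not otherwise confirmed here. If the service normalizes the input first, the offsets may refer to the normalized text instead. The helper name `spans_for_segments` is hypothetical.

```python
def spans_for_segments(source_text, response):
    """Return (label, significance, source span) for each interval.

    Assumes startPos/endPos are character offsets into source_text
    and that endPos is exclusive.
    """
    rows = []
    for segment in response.get("segments", []):
        for interval in segment.get("intervals", []):
            start, end = interval["startPos"], interval["endPos"]
            rows.append(
                (
                    segment.get("label"),
                    segment.get("significance"),
                    source_text[start:end],
                )
            )
    return rows


source = "FINDINGS: Normal heart and lungs. IMPRESSION: Normal study."
response = {
    "segments": [
        {
            "type": "body",
            "label": "Chest",
            "content": "Normal heart and lungs",
            "intervals": [{"startPos": 10, "endPos": 32}],
            "significance": "minor",
        }
    ],
    "text": "Chest:\n- Normal heart and lungs",
}

for label, significance, span in spans_for_segments(source, response):
    print(f"{label} [{significance}]: {span!r}")
# prints: Chest [minor]: 'Normal heart and lungs'
```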
Architecture
- Backend: Flask on Python 3.10+, type-annotated and checked with mypy
- NLP Engine: LangExtract for structured extraction
- AI Models: Google Gemini 2.5 (Flash/Pro)
- Frontend: Vanilla JavaScript with interactive UI
- Deployment: Docker + Hugging Face Spaces
- Package Details: See pyproject.toml for dependencies, metadata, and tooling
Project Structure
radextract/
├── app.py                  # Flask API endpoints
├── structure_report.py     # Core structuring logic
├── sanitize.py             # Text preprocessing & normalization
├── prompt_instruction.py   # LangExtract prompt
├── cache_manager.py        # Response caching
├── static/                 # Frontend assets
└── templates/              # HTML templates
Development
Code Quality
# Format code
pyink . && isort .
# Type checking
mypy . --ignore-missing-imports
# Run tests
pytest
Docker
# Build and run
docker build -t radextract .
docker run -p 7870:7870 --env-file env.list radextract
License
Apache License 2.0 - see LICENSE for details.
Related Projects
- LangExtract: Core NLP library
Built for the medical AI community | Hosted on Hugging Face Spaces
Disclaimer
This is not an officially supported Google product. If you use RadExtract or LangExtract in production or publications, please cite accordingly and acknowledge usage. Use is subject to the Apache 2.0 License. For health-related applications, use of LangExtract is also subject to the Health AI Developer Foundations Terms of Use.