metadata
title: BioMedNorm MCP Server
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: true
license: apache-2.0
python_version: 3.13.3
short_description: Extract and normalize entities from biomedical text.
tags:
  - mcp-server-track

An MCP server for extracting and normalizing domain-specific entities from biomedical text. It uses LLMs (here, OpenAI models) to identify entities and map them to standardized terminologies or ontologies.

Motivation

Approximately 80% of electronic health record (EHR) data exists as unstructured medical text. Such text often contains abbreviations, misspellings, and non-standardized terminology, creating barriers to effective data utilization. This variability hinders progress in areas such as:

  • Clinical decision support at the point of care
  • Patient comprehension of their own medical records
  • Biomedical research including cohort identification and pharmacovigilance

By implementing named entity recognition (NER) and normalization to standardized vocabularies (ontologies), our MCP server facilitates the extraction of structured data from diverse biomedical texts. This enhances the accuracy and efficiency of applications in clinical settings and research, bridging the gap between complex natural language and the structured data requirements of modern biomedical systems.

Installation

This project uses uv from Astral for dependency management. Make sure you have uv installed. If not, install it with the following command:

curl -LsSf https://astral.sh/uv/install.sh | sh

Follow these steps to set up the project:

Clone the repository

git clone https://huggingface.co/spaces/Agents-MCP-Hackathon/BioMedNorm-MCP-Server

Set up Python environment

The project includes a .python-version file to specify the required Python version. Ensure you are using the correct version by setting it up with your Python environment manager (e.g., pyenv).

Install dependencies

The project dependencies are defined in the pyproject.toml file. To install them, run:

uv pip install -e .

Set up environment variables

The project requires an OpenAI API key, which should be stored in a .env file.
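
As a minimal sketch, assuming the key is stored under the name OPENAI_API_KEY (the variable the OpenAI Python SDK reads by default; this README does not say which name app.py expects), the .env file would contain one KEY=VALUE line, and a startup check could parse it as follows. The helpers read_env_file and require_key are hypothetical and not part of the project.

```python
from pathlib import Path


def read_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def require_key(values: dict[str, str], name: str = "OPENAI_API_KEY") -> str:
    """Return the key, or fail early with a readable message."""
    key = values.get(name, "")
    if not key:
        raise RuntimeError(f"{name} is missing from the .env file")
    return key
```

Calling require_key(read_env_file(".env")) before starting the server surfaces a missing key immediately rather than on the first LLM request. Keep the .env file out of version control.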

Running the application

Run the application using uv run:

uv run app.py

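uv run executes the script inside the project's environment:

  • Project dependencies are installed or synced before the script starts
  • The application runs with the Python version and packages the project declares
  • Variables from .env are available only if the application loads the file itself or you pass it explicitly, for example with uv run --env-file .env app.py; uv run does not read .env by default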

After starting the server, you can access:

  • Web interface: http://your-server:port
  • MCP endpoint: http://your-server:port/gradio_api/mcp/sse
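
For illustration, both URLs and a client configuration entry can be derived from the host and port. The sketch below assumes Gradio's default local port (7860) and the mcpServers layout accepted by several MCP clients. The label "biomednorm" and the helper names are hypothetical, and the exact configuration format depends on the client you use.

```python
import json


def server_urls(host: str = "localhost", port: int = 7860) -> dict[str, str]:
    """Build the web UI and MCP SSE URLs for a given host and port."""
    base = f"http://{host}:{port}"
    return {"web": base, "mcp": f"{base}/gradio_api/mcp/sse"}


def client_config(mcp_url: str, label: str = "biomednorm") -> str:
    """Render an mcpServers entry that points at the SSE endpoint."""
    return json.dumps({"mcpServers": {label: {"url": mcp_url}}}, indent=2)
```

Printing client_config(server_urls()["mcp"]) gives a JSON snippet that can be pasted into a client's MCP settings file, adjusted to that client's conventions.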

Using the Web Interface

  • Enter text in the input area
  • Select the entity type (Disease, Tissue, or Cell Type)
  • Click "Normalize"
  • View the normalized entities in the results area

Using as an MCP Tool

The server exposes an MCP-compatible endpoint that can be used by AI agents. The tool accepts:

  • paragraph: Text to extract entities from
  • target_entity: Type of entity to extract ("Disease", "Tissue", or "Cell Type")

and returns a list of normalized entities.
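
To make the input contract concrete, here is a hedged sketch of the JSON-RPC message an MCP client would send to invoke the tool. The tools/call method and the name/arguments layout follow the MCP specification. The tool name below is a placeholder, since this README does not state it; a client would normally discover the real name through tools/list.

```python
import json

# Entity types listed in this README.
ALLOWED_ENTITIES = ("Disease", "Tissue", "Cell Type")


def build_tool_call(paragraph: str, target_entity: str,
                    tool_name: str = "normalize_entities",  # placeholder name
                    request_id: int = 1) -> str:
    """Validate the two inputs and wrap them in a tools/call request."""
    if not paragraph.strip():
        raise ValueError("paragraph must not be empty")
    if target_entity not in ALLOWED_ENTITIES:
        raise ValueError(f"target_entity must be one of {ALLOWED_ENTITIES}")
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": tool_name,
            "arguments": {"paragraph": paragraph, "target_entity": target_entity},
        },
    })
```

In practice an MCP client library builds and sends this message for you; the sketch only shows which fields carry paragraph and target_entity, and rejects entity types the server does not list.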

Video

Here is a video link that shows the MCP server in action with a Gradio-based MCP client:

Future Improvements

Our biomedical text normalization MCP server can be enhanced in several ways:

  • Expanded Entity Coverage: Extend beyond the current entity types (Disease, Tissue, Cell Type) to others such as library processing protocols, cell lines, and disease status.
  • User Feedback Loop: Implement a mechanism for users to correct normalization errors, creating a dataset for continuous model improvement.
  • Multilingual Support: Expand capabilities to handle medical text in languages beyond English.