---
title: BioMedNorm MCP Server
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: true
license: apache-2.0
python_version: 3.13.3
short_description: 'Extract and normalize entities from biomedical text.'
tags:
- mcp-server-track
---

An MCP server for extracting and normalizing domain-specific entities from biomedical text. We use LLMs (here, from OpenAI) to identify entities and map them to standardized terminology or ontologies.

## Motivation

Approximately 80% of electronic health record (EHR) data exists as unstructured medical text. Such text often contains abbreviations, misspellings, and non-standardized terminology, creating barriers to effective data utilization. This variability hinders progress in areas such as:

- **Clinical decision support** at the point of care
- **Patient comprehension** of their own medical records
- **Biomedical research**, including cohort identification and pharmacovigilance

By applying named entity recognition (NER) and normalization to standardized vocabularies (ontologies), our MCP server extracts structured data from diverse biomedical texts. This improves the accuracy and efficiency of applications in clinical settings and research, bridging the gap between complex natural language and the structured data requirements of modern biomedical systems.

## Installation

This project uses `uv` from Astral for dependency management. Make sure you have `uv` installed. If not, install it with the following command:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Follow these steps to set up the project:

### Clone the repository

```bash
git clone https://huggingface.co/spaces/Agents-MCP-Hackathon/BioMedNorm-MCP-Server
```

### Set up Python environment

The project includes a `.python-version` file that specifies the required Python version. Ensure you are using the correct version by setting it up with your Python environment manager (e.g., `pyenv`).

### Install dependencies

The project dependencies are defined in the `pyproject.toml` file. To install them, run:

```bash
uv pip install -e .
```

### Set up environment variables

The project **requires** an OpenAI API key, which should be stored in a `.env` file.

## Running the application

Run the application using `uv run`:

```bash
uv run app.py
```

This command ensures that:

- All project dependencies are correctly installed
- The environment variables from `.env` are loaded
- The application runs in the proper environment

After starting the server, you can access:

- Web interface: `http://your-server:port`
- MCP endpoint: `http://your-server:port/gradio_api/mcp/sse`

## Using the Web Interface

- Enter text in the input area
- Select the entity type (Disease, Tissue, or Cell Type)
- Click "Normalize"
- View the normalized entities in the results area

## Using as an MCP Tool

The server exposes an MCP-compatible endpoint that can be used by AI agents. The tool accepts:

- `paragraph`: text to extract entities from
- `target_entity`: type of entity to extract ("Disease", "Tissue", or "Cell Type")

and returns a list of normalized entities. A client sketch is shown below.
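### Example: calling the tool from Python

The following is a minimal sketch of calling the tool programmatically over the SSE endpoint with the official `mcp` Python SDK. The tool name `normalize_entities`, the port `7860` (Gradio's default), and the sample input are assumptions for illustration; list the server's tools first to discover the actual name exposed by your deployment.

```python
# Sketch only: assumes `pip install mcp` and a running BioMedNorm server.
# The tool name "normalize_entities" and port 7860 are placeholders;
# use the names reported by list_tools() for your deployment.
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

SERVER_URL = "http://localhost:7860/gradio_api/mcp/sse"  # adjust host/port


async def main() -> None:
    async with sse_client(SERVER_URL) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()

            # Discover the tool(s) the server actually exposes.
            tools = await session.list_tools()
            print("Available tools:", [tool.name for tool in tools.tools])

            # Call the extraction/normalization tool (name assumed).
            result = await session.call_tool(
                "normalize_entities",
                {
                    "paragraph": "Patient presents with T2DM and chronic HTN.",
                    "target_entity": "Disease",
                },
            )
            print(result.content)


asyncio.run(main())
```

Any MCP-aware agent framework can instead be pointed directly at the same SSE URL, without using the SDK by hand.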
### Video

Here is a [video link](./MCPServer_test.webm) that shows the MCP server in action with a Gradio-based MCP client.

## Future Improvements

Our biomedical text normalization MCP server can be enhanced in several ways:

- Expanded Entity Coverage: Extend beyond the current entity types (Disease, Tissue, Cell Type) to other entities such as library processing protocols, cell lines, and disease status, to name a few.
- User Feedback Loop: Implement a mechanism for users to correct normalization errors, creating a dataset for continuous model improvement.
- Multilingual Support: Expand capabilities to handle medical text in languages beyond English.