How to Install Microsoft Markitdown MCP Server: A Comprehensive Guide
Microsoft Markitdown is a Python utility that converts various file formats to Markdown for use with Large Language Models (LLMs) and text analysis pipelines. It recently gained an MCP (Model Context Protocol) server, which lets LLM applications such as Claude Desktop call its conversion functionality directly. This article explains how to install, run, and configure the Microsoft Markitdown MCP Server.
What is Microsoft Markitdown?
Before diving into the installation process, it's important to understand what Microsoft Markitdown is and why it's useful. Markitdown is a lightweight Python utility that converts various file formats into Markdown while preserving important document structure and content elements such as headings, lists, tables, and links. It supports a wide range of file formats, including:
- PDF documents
- Microsoft Office files (PowerPoint, Word, Excel)
- Images (with EXIF metadata and OCR capabilities)
- Audio files (with EXIF metadata and speech transcription)
- HTML files
- Text-based formats (CSV, JSON, XML)
- ZIP files
- YouTube URLs
- EPubs
- And many more
The primary advantage of Markdown is that it stays close to plain text while still representing document structure. That makes it a good fit for LLMs, which have been trained extensively on Markdown-formatted text. Its minimal markup is also token-efficient, which helps keep prompts compact.
Understanding the Markitdown MCP Server
The Markitdown-MCP package provides a lightweight STDIO (Standard Input/Output) and SSE (Server-Sent Events) MCP server for calling Markitdown's functionality. It exposes one primary tool, convert_to_markdown(uri), which can process any HTTP, HTTPS, file, or data URI. This server allows LLM applications like Claude Desktop to use Markitdown in their document processing workflows.
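To make the URI schemes concrete, here is a minimal sketch of how a client might build file and data URIs to pass as the uri argument. The helper names are hypothetical, and the base64 form of the data URI is an assumption about what a caller would typically send, not something the server documentation prescribes.

```python
import base64
from pathlib import Path


def file_uri(path: str) -> str:
    """Return a file: URI for a local path (the file does not need to exist)."""
    # resolve() makes the path absolute, which as_uri() requires.
    return Path(path).resolve().as_uri()


def data_uri(payload: bytes, mime_type: str = "text/plain") -> str:
    """Embed raw bytes in a base64-encoded data: URI."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


print(file_uri("report.pdf"))
print(data_uri(b"<h1>Hello</h1>", "text/html"))
```

HTTP and HTTPS URIs need no helper; they are passed through unchanged.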
Installation Methods
There are several ways to install and run the Microsoft Markitdown MCP Server. Let's explore each method in detail.
Method 1: Direct Installation via pip
The simplest way to install the Markitdown MCP server is using Python's package manager, pip:
pip install markitdown-mcp
This command installs the base MCP server package. However, to use all of Markitdown's features, you'll also need to install Markitdown with all optional dependencies:
pip install 'markitdown[all]'
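After running the commands above, a quick check from Python can confirm that both distributions are visible to the interpreter you intend to use. This is a small sketch using only the standard library; the helper name is hypothetical.

```python
from importlib import metadata


def installed_version(distribution: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Report on the two distributions this guide installs.
for name in ("markitdown", "markitdown-mcp"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If either line reports "not installed", pip most likely installed into a different Python environment than the one running the script.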
Method 2: Installation from Source
For users who prefer to install from source or want the latest development version:
# Clone the repository
git clone https://github.com/microsoft/markitdown.git
# Navigate to the project directory
cd markitdown
# Install the main Markitdown package with all dependencies
pip install -e 'packages/markitdown[all]'
# Install the MCP server package
pip install -e packages/markitdown-mcp
Method 3: Docker Installation
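Using Docker is recommended, especially when integrating with applications like Claude Desktop. The build command references packages/markitdown-mcp/Dockerfile by relative path, so run it from the root of the cloned repository (see Method 2):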
# Build the Docker image
docker build -t markitdown-mcp:latest -f packages/markitdown-mcp/Dockerfile .
# Run the container
docker run -it --rm markitdown-mcp:latest
Running the Markitdown MCP Server
After installation, you can run the server in different ways depending on your requirements:
Using STDIO (Standard Input/Output)
This is the default mode for running the MCP server:
markitdown-mcp
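In STDIO mode the client and server exchange newline-delimited JSON-RPC 2.0 messages over the process's standard input and output. The sketch below builds the three messages a client would send for a single tool call. The client name is hypothetical, and the protocolVersion string is one published MCP revision chosen for illustration; a real client library normally handles all of this for you.

```python
import json


def stdio_frames(uri: str) -> list[str]:
    """Build the newline-delimited JSON-RPC messages for one tool call."""
    messages = [
        {   # 1. Handshake request: the client introduces itself.
            "jsonrpc": "2.0", "id": 1, "method": "initialize",
            "params": {
                "protocolVersion": "2024-11-05",  # assumed revision
                "capabilities": {},
                "clientInfo": {"name": "example-client", "version": "0.1"},
            },
        },
        # 2. Notification (no id): sent once the initialize response arrives.
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        {   # 3. Request that invokes the conversion tool.
            "jsonrpc": "2.0", "id": 2, "method": "tools/call",
            "params": {"name": "convert_to_markdown", "arguments": {"uri": uri}},
        },
    ]
    return [json.dumps(message) + "\n" for message in messages]


for frame in stdio_frames("https://example.com/"):
    print(frame, end="")
```

A client writes each frame to the server's stdin and reads one JSON response line from stdout for every message that carries an id.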
Using SSE (Server-Sent Events)
For web-based applications or when running as a network service:
markitdown-mcp --sse --host 127.0.0.1 --port 3001
By default, the server binds to localhost for security reasons.
Working with Local Files in Docker
To access local files when running the MCP server in Docker, you need to mount local directories into the container:
docker run -it --rm -v /path/to/local/data:/workdir markitdown-mcp:latest
This mounts the local directory /path/to/local/data to /workdir inside the container. For example, if you have a file named report.pdf in your local directory, it will be accessible in the container at /workdir/report.pdf.
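Because the server runs inside the container, any file URI passed to it has to use the container path, not the host path. The following sketch translates one into the other. It assumes POSIX-style host paths, and the function name is hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def container_uri(host_file: str, host_dir: str, mount_point: str = "/workdir") -> str:
    """Translate a host path under host_dir into a file: URI inside the container."""
    # relative_to() raises ValueError if host_file is outside the mounted directory.
    relative = PurePosixPath(host_file).relative_to(host_dir)
    inside = PurePosixPath(mount_point) / relative
    # quote() percent-encodes spaces and other unsafe characters, keeping "/".
    return "file://" + quote(str(inside))


print(container_uri("/path/to/local/data/report.pdf", "/path/to/local/data"))
# prints file:///workdir/report.pdf
```

Files outside the mounted directory are rejected with a ValueError, which mirrors the fact that the container cannot see them.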
Integrating with Claude Desktop
Claude Desktop is one application that works particularly well with the Markitdown MCP Server. Here's how to set it up:
- Locate Claude's configuration file (claude_desktop_config.json) according to Claude's documentation.
- Edit the file to include the Markitdown MCP server configuration:
{
"mcpServers": {
"markitdown": {
"command": "docker",
"args": [
"run",
"--rm",
"-i",
"markitdown-mcp:latest"
]
}
}
}
- If you need to access local files, include a volume mount:
{
"mcpServers": {
"markitdown": {
"command": "docker",
"args": [
"run",
"--rm",
"-i",
"-v",
"/path/to/local/data:/workdir",
"markitdown-mcp:latest"
]
}
}
}
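If the configuration file already lists other MCP servers, the markitdown entry should be merged in rather than pasted over them. The sketch below generates the same JSON as the two examples above and preserves existing entries. The function names and the "other" server are hypothetical, and reading or writing the actual file is left out because its location depends on your platform.

```python
import json
from typing import Optional


def markitdown_entry(mount: Optional[str] = None,
                     image: str = "markitdown-mcp:latest") -> dict:
    """Build the "markitdown" server entry; mount is "host_dir:container_dir"."""
    args = ["run", "--rm", "-i"]
    if mount:
        args += ["-v", mount]  # optional volume mount for local files
    args.append(image)
    return {"command": "docker", "args": args}


def add_server(config: dict, mount: Optional[str] = None) -> dict:
    """Return a copy of config with the markitdown server added."""
    servers = dict(config.get("mcpServers", {}))  # keep any existing servers
    servers["markitdown"] = markitdown_entry(mount)
    return {**config, "mcpServers": servers}


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
print(json.dumps(add_server(existing, "/path/to/local/data:/workdir"), indent=2))
```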
Customizing Optional Dependencies
Markitdown supports various file formats through optional dependencies. While installing with [all] includes everything, you can customize the installation for your specific needs:
- [pdf]: For PDF files
- [pptx]: For PowerPoint files
- [docx]: For Word files
- [xlsx]: For Excel files
- [xls]: For older Excel files
- [outlook]: For Outlook messages
- [az-doc-intel]: For Azure Document Intelligence
- [audio-transcription]: For audio transcription of WAV and MP3 files
- [youtube-transcription]: For fetching YouTube video transcriptions
For example, if you only need PDF and Word support:
pip install 'markitdown[pdf,docx]'
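When the set of extras is chosen by a script (for example, in a provisioning tool), it can help to validate the names before handing them to pip. This is a small sketch with a hypothetical helper; the set of known extras is copied from the list in this section and may not match every Markitdown release.

```python
from typing import Iterable

# Extras named in this guide; treat the set as an assumption, not an API.
KNOWN_EXTRAS = {
    "all", "pdf", "pptx", "docx", "xlsx", "xls", "outlook",
    "az-doc-intel", "audio-transcription", "youtube-transcription",
}


def requirement(extras: Iterable[str]) -> str:
    """Build a pip requirement string such as markitdown[pdf,docx]."""
    chosen = list(dict.fromkeys(extras))  # de-duplicate, keep order
    unknown = [name for name in chosen if name not in KNOWN_EXTRAS]
    if unknown:
        raise ValueError(f"unknown extras: {', '.join(unknown)}")
    return f"markitdown[{','.join(chosen)}]" if chosen else "markitdown"


print(requirement(["pdf", "docx"]))  # prints markitdown[pdf,docx]
```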
Debugging the MCP Server
For troubleshooting and testing, you can use the MCP Inspector tool:
- Launch the inspector (npx downloads the package on first use, so no separate install step is needed):
npx @modelcontextprotocol/inspector
- Open the inspector in a browser at the host and port it prints on startup (e.g., http://localhost:5173/).
For STDIO connections:
- Select "STDIO" as the transport type
- Enter "markitdown-mcp" as the command
- Click "Connect"
For SSE connections:
- Select "SSE" as the transport type
- Enter "http://127.0.0.1:3001/sse" as the URL
- Click "Connect"
Navigate to the Tools tab, click "List Tools", select "convert_to_markdown", and test with a valid URI.
Security Considerations
The Markitdown MCP server does not support authentication and runs with the privileges of the user executing it. For this reason, when using SSE mode, bind the server only to localhost (the default setting) so that other machines on the network cannot reach it.
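A launcher script can enforce this rule by refusing any --host value that is not a loopback address. The check below is a minimal sketch using the standard library; the function name is hypothetical, and hostnames other than "localhost" are conservatively treated as non-local rather than resolved.

```python
import ipaddress


def is_loopback(host: str) -> bool:
    """True for 'localhost' or a loopback IP literal such as 127.0.0.1 or ::1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal: treat unknown hostnames as non-local.
        return False


for candidate in ("127.0.0.1", "::1", "0.0.0.0", "192.168.1.10"):
    verdict = "safe" if is_loopback(candidate) else "exposed to the network"
    print(f"{candidate}: {verdict}")
```

Note that 0.0.0.0 fails the check: it binds every interface, which is exactly the exposure this section warns against.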
Conclusion
The Microsoft Markitdown MCP Server gives LLM applications a direct way to convert many file formats to Markdown. It can be installed with pip, from source, or as a Docker image, and it runs over either STDIO or SSE, so it fits document analysis, content extraction, and larger AI workflows alike.
With the steps in this guide, you can install the server, connect it to an application like Claude Desktop, and test it with the MCP Inspector before relying on it in your own workflow.
For more information, updates, and community support, visit the official GitHub repository at https://github.com/microsoft/markitdown.