---
title: Htrflow Mcp
emoji: 🔥
colorFrom: green
colorTo: gray
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
tags:
  - mcp-server-track
  - htrflow
  - htr
  - ocr
  - api
pinned: false
license: apache-2.0
short_description: Image to text, alto- or page-xml
---

Video showcase:

<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/60a4e677917119d38f6bbff8/tMp-t2pV3t3HABW0YLbMy.mp4"></video>


## MCP tooling

- `htr_text`: Extract plain text from handwritten documents.
  - Parameters: `image_path` (string), `document_type` (string, default: `"letter_swedish"`), `custom_settings` (optional JSON string)
  - Returns: the extracted text as a string
- `htrflow_file`: Process HTR and return formatted files.
  - Parameters: `image_path` (string), `document_type` (string), `output_format` (string, default: `"alto"`), `custom_settings` (optional JSON string), `server_name` (string)
  - Returns: a downloadable file in the specified format
  - Supported formats: `txt`, `alto`, `page`, `json`
- `htrflow_visualizer`: Visualize HTR results on the original image.
  - Parameters: `image_path` (string), `htr_document_path` (string), `server_name` (string)
  - Returns: a visualization image with text regions highlighted
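
As a rough illustration of how an MCP client invokes one of the tools listed here, the sketch below builds a JSON-RPC 2.0 `tools/call` request, the message shape defined by the Model Context Protocol. The helper name `build_tool_call` and the example image URL are hypothetical. The tool name the server actually exposes may also differ from the short names above, so it is worth checking with a `tools/list` request first.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 'tools/call' request as sent by MCP clients.

    `build_tool_call` is a hypothetical helper for illustration only.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Assumption: the image is reachable by the server at this (made-up) URL.
request = build_tool_call(
    "htr_text",
    {
        "image_path": "https://example.org/letter.jpg",
        "document_type": "letter_swedish",
    },
)
print(json.dumps(request, indent=2))
```

In practice an MCP client such as Claude Desktop builds and sends this message itself; the sketch only shows which arguments end up in the call.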


![image/png](https://cdn-uploads.huggingface.co/production/uploads/60a4e677917119d38f6bbff8/W1lfkOp_xazuhoWtrCFb-.png)

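## Claude Desktop

Add the server to the Claude Desktop MCP configuration, replacing the `[YOUR-USERNAME]` placeholder in the URL: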

```json
{
  "mcpServers": {
    "htrflow": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://[YOUR-USERNAME].hf.space/gradio_api/mcp/sse",
        "--transport",
        "sse-only"
      ]
    }
  }
}
```
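
If a Claude Desktop configuration already lists other MCP servers, the entry has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge, working on a plain dictionary. The helper name `add_mcp_server` and the `other` server entry are hypothetical, and the URL keeps the `[YOUR-USERNAME]` placeholder, which still has to be filled in by hand.

```python
import json


def add_mcp_server(config, name, sse_url):
    """Return a copy of `config` with an mcp-remote SSE entry added under `name`.

    Hypothetical helper: existing servers are kept and the input is not mutated.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {
        "command": "npx",
        "args": ["mcp-remote", sse_url, "--transport", "sse-only"],
    }
    merged["mcpServers"] = servers
    return merged


# Made-up existing configuration with one unrelated server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
updated = add_mcp_server(
    existing,
    "htrflow",
    "https://[YOUR-USERNAME].hf.space/gradio_api/mcp/sse",
)
print(json.dumps(updated, indent=2))
```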

## Usage Examples
- Can you extract the text from this handwritten Swedish letter? [upload image]
- Process this handwritten document and return the results in ALTO XML format for archival purposes.
- Show me the HTR results overlaid on the original image so I can see how accurate the text detection was.


### Standard Letter Processing
1. Segmentation: Detect text lines using YOLO
2. Text Recognition: Extract text using TrOCR
3. Line Ordering: Organize text in reading order

### Spread Processing
1. Region Segmentation: Detect page regions
2. Line Segmentation: Detect text lines within regions
3. Text Recognition: Extract text using TrOCR
4. Reading Order: Handle marginalia and two-page layout

### Custom Settings
You can override the default pipeline by passing custom settings as a JSON string in the `custom_settings` parameter:

```json
{
  "steps": [
    {
      "step": "Segmentation",
      "settings": {
        "model": "yolo",
        "model_settings": {
          "model": "Riksarkivet/yolov9-lines-within-regions-1"
        },
        "generation_settings": {"batch_size": 8}
      }
    },
    {
      "step": "TextRecognition",
      "settings": {
        "model": "TrOCR",
        "model_settings": {
          "model": "microsoft/trocr-base-handwritten"
        },
        "generation_settings": {"batch_size": 16}
      }
    }
  ]
}
```
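
Because `custom_settings` is passed as a JSON string rather than a nested object, the pipeline has to be serialized before it goes into the tool arguments. The sketch below shows one way to do that, reusing the two steps from the example above. The helper name `make_custom_settings` and the `letter.jpg` path are hypothetical, and the server may accept other keys that are not covered here.

```python
import json


def make_custom_settings(segmentation_model, recognition_model,
                         seg_batch=8, rec_batch=16):
    """Serialize a two-step pipeline (segmentation + recognition) to a JSON string.

    Hypothetical helper mirroring the structure of the JSON example above.
    """
    pipeline = {
        "steps": [
            {
                "step": "Segmentation",
                "settings": {
                    "model": "yolo",
                    "model_settings": {"model": segmentation_model},
                    "generation_settings": {"batch_size": seg_batch},
                },
            },
            {
                "step": "TextRecognition",
                "settings": {
                    "model": "TrOCR",
                    "model_settings": {"model": recognition_model},
                    "generation_settings": {"batch_size": rec_batch},
                },
            },
        ]
    }
    return json.dumps(pipeline)


custom_settings = make_custom_settings(
    "Riksarkivet/yolov9-lines-within-regions-1",
    "microsoft/trocr-base-handwritten",
)

# Tool arguments as they might be passed to htr_text; the image path is made up.
arguments = {
    "image_path": "letter.jpg",
    "document_type": "letter_swedish",
    "custom_settings": custom_settings,
}
print(json.dumps(arguments, indent=2))
```

Round-tripping the string through `json.loads` before sending it is a cheap way to catch syntax errors early.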

There was not enough time to finish it, but the IIIF part should be integrated as well:

- https://github.com/AI-Riksarkivet/oxenstierna
- https://huggingface.co/collections/Riksarkivet/mcps-68447208f9eddd623a83fbc9