htrflow_mcp / README.md

Gabriel

Update README.md

ac22361 verified 14 days ago

	---
	title: Htrflow Mcp
	emoji: 🔥
	colorFrom: green
	colorTo: gray
	sdk: gradio
	sdk_version: 5.33.0
	app_file: app.py
	tags:
	- mcp-server-track
	- htrflow
	- htr
	- ocr
	- api
	pinned: false
	license: apache-2.0
	short_description: Image to text, alto- or page-xml
	---

	Video showcase:

	<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/60a4e677917119d38f6bbff8/tMp-t2pV3t3HABW0YLbMy.mp4"></video>


	## MCP tooling

	- htr_text: Extract plain text from handwritten documents
	  - Parameters: image_path (string), document_type (string, default: "letter_swedish"), custom_settings (optional JSON string)
	  - Returns: Extracted text as string

	- htrflow_file: Process HTR and return formatted files
	  - Parameters: image_path (string), document_type (string), output_format (string, default: "alto"), custom_settings (optional JSON), server_name (string)
	  - Returns: Downloadable file in specified format
	  - Supported formats: txt, alto, page, json

	- htrflow_visualizer: Visualize HTR results on original image
	  - Parameters: image_path (string), htr_document_path (string), server_name (string)
	  - Returns: Visualization image with text regions highlighted
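To make the tool signatures above concrete, here is a minimal sketch of the JSON-RPC 2.0 `tools/call` messages an MCP client would send for `htr_text` and `htrflow_file`. The helper names and the example image URL are hypothetical, and `server_name` and `custom_settings` are left out because the README does not say whether they are required. In practice the MCP client builds these messages for you, so this is only meant to show the shape of the arguments.

```python
import json

# Output formats listed for htrflow_file in this README.
SUPPORTED_FORMATS = ("txt", "alto", "page", "json")


def build_tool_call(name, arguments, request_id=1):
    """Build an MCP `tools/call` request (JSON-RPC 2.0) as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def htr_text_call(image_path, document_type="letter_swedish"):
    """Request plain-text extraction; the optional custom_settings is omitted."""
    return build_tool_call(
        "htr_text",
        {"image_path": image_path, "document_type": document_type},
    )


def htrflow_file_call(image_path, document_type, output_format="alto"):
    """Request a formatted file, rejecting formats the README does not list."""
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    return build_tool_call(
        "htrflow_file",
        {
            "image_path": image_path,
            "document_type": document_type,
            "output_format": output_format,
        },
    )


# Hypothetical image location, used only for illustration.
IMAGE = "https://example.org/letter.jpg"

if __name__ == "__main__":
    print(json.dumps(htr_text_call(IMAGE), indent=2))
    print(json.dumps(htrflow_file_call(IMAGE, "letter_swedish", "page"), indent=2))
```

Validating `output_format` on the client side is a design choice of this sketch, so that a typo fails before a request is sent.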


	![image/png](https://cdn-uploads.huggingface.co/production/uploads/60a4e677917119d38f6bbff8/W1lfkOp_xazuhoWtrCFb-.png)

	Claude Desktop configuration (replace `[YOUR-USERNAME]` with the host of your own Space):

	```json
	{
	  "mcpServers": {
	    "htrflow": {
	      "command": "npx",
	      "args": [
	        "mcp-remote",
	        "https://[YOUR-USERNAME].hf.space/gradio_api/mcp/sse",
	        "--transport",
	        "sse-only"
	      ]
	    }
	  }
	}
	```
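If you already have other MCP servers configured, the entry above has to be merged into the existing `mcpServers` object rather than replacing it. The following is a small sketch of that merge. The function name `add_htrflow_server` and the sample `other` server are hypothetical, and the URL keeps the README's `[YOUR-USERNAME]` placeholder.

```python
import json


def add_htrflow_server(config, sse_url, name="htrflow"):
    """Return a copy of a Claude Desktop config with an mcp-remote entry added.

    Existing servers are kept; an entry with the same name is overwritten.
    """
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers[name] = {
        "command": "npx",
        "args": ["mcp-remote", sse_url, "--transport", "sse-only"],
    }
    updated["mcpServers"] = servers
    return updated


# Placeholder URL from the README; substitute the host of your own Space.
SSE_URL = "https://[YOUR-USERNAME].hf.space/gradio_api/mcp/sse"

# Hypothetical existing configuration with one unrelated server.
EXISTING = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}

if __name__ == "__main__":
    print(json.dumps(add_htrflow_server(EXISTING, SSE_URL), indent=2))
```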

	## Usage Examples
	- Can you extract the text from this handwritten Swedish letter? [upload image]
	- Process this handwritten document and return the results in ALTO XML format for archival purposes.
	- Show me the HTR results overlaid on the original image so I can see how accurate the text detection was.


	### Standard Letter Processing
	1. Segmentation: Detect text lines using YOLO
	2. Text Recognition: Extract text using TrOCR
	3. Line Ordering: Organize text in reading order

	### Spread Processing
	1. Region Segmentation: Detect page regions
	2. Line Segmentation: Detect text lines within regions
	3. Text Recognition: Extract text using TrOCR
	4. Reading Order: Handle marginalia and two-page layout
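The two pipelines above can be sketched as ordered lists of step dictionaries. This is an illustration, not the Space's actual configuration: the `Segmentation` and `TextRecognition` step names follow the settings format used in this README, while `OrderLines`, `ReadingOrderMarginalia`, and the `two_page` setting are assumptions about HTRflow's step names that should be checked against the HTRflow documentation. Model identifiers are passed in by the caller rather than guessed.

```python
def _model_step(step, model, model_id):
    """Build one step dict in the shape used by the settings JSON."""
    return {
        "step": step,
        "settings": {"model": model, "model_settings": {"model": model_id}},
    }


def letter_steps(line_model, htr_model):
    """Standard letter: line segmentation, recognition, line ordering."""
    return [
        _model_step("Segmentation", "yolo", line_model),
        _model_step("TextRecognition", "TrOCR", htr_model),
        {"step": "OrderLines"},  # assumed step name
    ]


def spread_steps(region_model, line_model, htr_model):
    """Spread: regions first, then lines within regions, then recognition."""
    return [
        _model_step("Segmentation", "yolo", region_model),
        _model_step("Segmentation", "yolo", line_model),
        _model_step("TextRecognition", "TrOCR", htr_model),
        # Assumed step name and setting for marginalia and two-page layouts.
        {"step": "ReadingOrderMarginalia", "settings": {"two_page": True}},
    ]


if __name__ == "__main__":
    steps = spread_steps(
        "<region-model-id>",  # placeholder, supply a real region model
        "Riksarkivet/yolov9-lines-within-regions-1",
        "microsoft/trocr-base-handwritten",
    )
    print([s["step"] for s in steps])
```

The spread pipeline differs from the letter pipeline only in running segmentation twice, once for regions and once for lines inside them, and in using a reading-order step that is aware of margins and facing pages.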

	### Custom Settings
	You can provide custom pipeline settings as JSON:

	```json
	{
	  "steps": [
	    {
	      "step": "Segmentation",
	      "settings": {
	        "model": "yolo",
	        "model_settings": {
	          "model": "Riksarkivet/yolov9-lines-within-regions-1"
	        },
	        "generation_settings": {"batch_size": 8}
	      }
	    },
	    {
	      "step": "TextRecognition",
	      "settings": {
	        "model": "TrOCR",
	        "model_settings": {
	          "model": "microsoft/trocr-base-handwritten"
	        },
	        "generation_settings": {"batch_size": 16}
	      }
	    }
	  ]
	}
	```
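Because `custom_settings` is passed to the tools as a JSON string, it can be convenient to keep the settings as a Python dictionary and serialize it just before the call. The sketch below does that and shows how to override one step's batch size. The helper names are hypothetical, and the check only verifies the `steps`/`step` structure shown above; it does not validate the settings against HTRflow itself.

```python
import copy
import json

# The settings from the JSON example above, as a Python dict.
SETTINGS = {
    "steps": [
        {
            "step": "Segmentation",
            "settings": {
                "model": "yolo",
                "model_settings": {"model": "Riksarkivet/yolov9-lines-within-regions-1"},
                "generation_settings": {"batch_size": 8},
            },
        },
        {
            "step": "TextRecognition",
            "settings": {
                "model": "TrOCR",
                "model_settings": {"model": "microsoft/trocr-base-handwritten"},
                "generation_settings": {"batch_size": 16},
            },
        },
    ]
}


def with_batch_size(settings, step_name, batch_size):
    """Return a deep copy with batch_size changed for every step named step_name."""
    updated = copy.deepcopy(settings)
    for step in updated["steps"]:
        if step["step"] == step_name:
            generation = step.setdefault("settings", {}).setdefault("generation_settings", {})
            generation["batch_size"] = batch_size
    return updated


def to_custom_settings(settings):
    """Check the basic structure and serialize to the JSON string the tools expect."""
    steps = settings.get("steps")
    if not isinstance(steps, list) or not steps:
        raise ValueError("settings must contain a non-empty 'steps' list")
    for index, step in enumerate(steps):
        if not isinstance(step, dict) or "step" not in step:
            raise ValueError(f"step {index} is missing the 'step' key")
    return json.dumps(settings)


if __name__ == "__main__":
    smaller = with_batch_size(SETTINGS, "TextRecognition", 4)
    print(to_custom_settings(smaller))
```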

	There was not enough time to do so, but we would also like to integrate the IIIF part:
	https://github.com/AI-Riksarkivet/oxenstierna
	https://huggingface.co/collections/Riksarkivet/mcps-68447208f9eddd623a83fbc9