# OCR Quality Assessment Pipeline Demo

This demo showcases the **OCR Quality Assessment Pipeline** from the Impresso project, which analyzes and improves text produced by OCR (Optical Character Recognition).

## Features

- **OCR Error Detection**: Identifies common OCR mistakes and artifacts
- **Quality Assessment**: Evaluates the overall quality of OCR text
- **Text Correction**: Suggests improvements for detected errors
- **Interactive Interface**: User-friendly Gradio web interface

## Usage

The demo accepts OCR text as input and returns:

- Quality assessment scores
- Detected OCR errors
- Suggested corrections
- Processed/improved text

## Example

Try the provided German text example, which contains typical OCR errors such as:

- Dropped characters (e.g., "Zaubrisch" instead of "Zauberisch")
- Substituted characters (e.g., "nacb" instead of "nach")
- Letters misread as punctuation (e.g., "d:m" instead of "dem")

## Installation

```bash
pip install -r requirements.txt
python app.py
```

The demo will be available at `http://localhost:7860`.
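
To give a rough feel for what a quality assessment score can mean, here is a minimal toy sketch. It is not the Impresso pipeline's method. It assumes a score defined as the share of tokens found in a known-word lexicon, and the names `tokenize`, `quality_score`, and `LEXICON` are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical mini-lexicon, built from the corrected forms in the example above.
# A real lexicon would hold many thousands of words.
LEXICON = {"zauberisch", "nach", "dem"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and return runs of letters (digits and punctuation split tokens)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def quality_score(text: str, lexicon: set[str]) -> tuple[float, list[str]]:
    """Return (share of tokens found in the lexicon, list of unknown tokens).

    Assumption: an empty input scores 0.0, since there is nothing to assess.
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0, []
    unknown = [t for t in tokens if t not in lexicon]
    known_count = len(tokens) - len(unknown)
    return known_count / len(tokens), unknown


if __name__ == "__main__":
    for sample in ("Zauberisch nach dem", "Zaubrisch nacb d:m"):
        score, unknown = quality_score(sample, LEXICON)
        print(f"{sample!r}: score={score:.2f}, unknown={unknown}")
```

In this sketch the colon in "d:m" splits the word into two unknown tokens, so punctuation noise lowers the score twice. Whether that is desirable depends on how the score is meant to be used.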