---
title: TextLens - AI-Powered OCR
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
---

# TextLens - AI-Powered OCR

[Hugging Face Space](https://huggingface.co/spaces/GoConqurer/textlens-ocr) | [GitHub repository](https://github.com/KumarAmrit30/textlens-ocr) | [Python downloads](https://www.python.org/downloads/)

TextLens is a Vision-Language Model (VLM) based OCR application that extracts text from images using Microsoft Florence-2, with an automatic fallback to EasyOCR and zero-downtime deployment.

## Live Demo

**Try it now:** [https://huggingface.co/spaces/GoConqurer/textlens-ocr](https://huggingface.co/spaces/GoConqurer/textlens-ocr)

## Key Features

### Advanced AI-Powered OCR

- **Microsoft Florence-2 VLM**: State-of-the-art vision-language model for text extraction
- **Intelligent Fallback System**: Automatic fallback to EasyOCR if the primary model fails
- **Multi-Model Support**: Florence-2-base and Florence-2-large variants
- **Real-time Processing**: Text extraction starts as soon as an image is uploaded

### Modern User Experience

- **Clean UI**: Professional Gradio interface with intuitive design
- **Multiple Input Methods**: Upload files, use a webcam, or paste from the clipboard
- **Copy-to-Clipboard**: One-click text copying
- **Responsive Design**: Works on desktop and mobile devices
- **Dark/Light Theme**: Automatic theme adaptation

### Performance & Reliability

- **GPU Acceleration**: Supports CUDA, MPS (Apple Silicon), and CPU inference
- **Smart Device Detection**: Automatically uses the best available hardware
- **Error Resilience**: Robust error handling with graceful degradation
- **Memory Optimization**: Efficient model loading and cleanup

### Enterprise Features

- **Zero Downtime Deployment**: Blue-green deployment with health checks
- **Health Monitoring**: Built-in `/health` and `/ready` endpoints
- **Graceful Shutdown**: Signal handling for clean application restarts
- **Production
Ready**: Scalable architecture with automated deployment

## Quick Start

### Online (Recommended)

**Instant access** - No installation required: [**Launch TextLens**](https://huggingface.co/spaces/GoConqurer/textlens-ocr)

### Local Development

1. **Clone Repository**

   ```bash
   git clone https://github.com/KumarAmrit30/textlens-ocr.git
   cd textlens-ocr
   ```

2. **Setup Environment**

   ```bash
   python -m venv textlens_env
   source textlens_env/bin/activate  # Windows: textlens_env\Scripts\activate
   pip install -r requirements.txt
   ```

3. **Launch Application**

   ```bash
   python app.py
   ```

   Open: `http://localhost:7860`

### Quick Test

```bash
# Verify installation
python -c "from models.ocr_processor import OCRProcessor; print('TextLens ready!')"
```

## Model Performance

| Model                | Size  | Speed  | Accuracy  | Best For               |
| -------------------- | ----- | ------ | --------- | ---------------------- |
| **Florence-2-base**  | 230M  | Fast   | High      | General OCR, Real-time |
| **Florence-2-large** | 770M  | Medium | Very High | High accuracy needs    |
| **EasyOCR**          | ~100M | Medium | Good      | Fallback, Multilingual |

## Supported Use Cases

| Category         | Examples                        | Performance |
| ---------------- | ------------------------------- | ----------- |
| **Documents**    | PDFs, Scanned papers, Forms     | ⭐⭐⭐⭐⭐  |
| **Receipts**     | Shopping receipts, Invoices     | ⭐⭐⭐⭐    |
| **Screenshots**  | App interfaces, Error messages  | ⭐⭐⭐⭐⭐  |
| **Vehicle**      | License plates, VIN numbers     | ⭐⭐⭐⭐    |
| **Books**        | Printed text, Handwritten notes | ⭐⭐⭐⭐    |
| **Multilingual** | Multiple languages              | ⭐⭐⭐      |

## Configuration

### Model Selection

```python
from models.ocr_processor import OCRProcessor

# Fast inference (recommended)
ocr = OCRProcessor(model_name="microsoft/Florence-2-base")

# Maximum accuracy
ocr = OCRProcessor(model_name="microsoft/Florence-2-large")
```

### UI Customization

Modify `ui/styles.py` to customize appearance:

```python
# Change color scheme
PRIMARY_COLOR = "#1f77b4"
SECONDARY_COLOR = "#ff7f0e"

# Update layout
INTERFACE_WIDTH = "100%"
```

### Environment Variables

| Variable               | Description           | Default                |
| ---------------------- | --------------------- | ---------------------- |
| `SPACE_ID`             | Hugging Face Space ID | Auto-detected          |
| `DEPLOYMENT_STAGE`     | Deployment stage      | `production`           |
| `TRANSFORMERS_CACHE`   | Model cache path      | `~/.cache/huggingface` |
| `CUDA_VISIBLE_DEVICES` | GPU selection         | All available          |

**Deployment Flow:**

```mermaid
graph LR
    A[Code Push] --> B[Validate]
    B --> C[Deploy Staging]
    C --> D[Health Check]
    D --> E[Deploy Production]
    E --> F[Verify]
    F --> G[Complete]
```

## Contributing

We welcome contributions! Here's how to get started:

### Development Setup

1. **Fork & Clone**

   ```bash
   git clone https://github.com/YOUR_USERNAME/textlens-ocr.git
   cd textlens-ocr
   ```

2. **Create Branch**

   ```bash
   git checkout -b feature/your-feature-name
   ```

3. **Make Changes**

   - Add new features or fix bugs
   - Update tests and documentation
   - Follow code style guidelines

4. **Test Changes**

   ```bash
   python -m pytest tests/
   python -c "from models.ocr_processor import OCRProcessor; OCRProcessor()"
   ```

5. **Submit PR**

   ```bash
   git add .
   git commit -m "feat: add your feature description"
   git push origin feature/your-feature-name
   ```

### Contribution Guidelines

- **Code Style**: Follow PEP 8 and format with Black
- **Documentation**: Update the README and docstrings
- **Tests**: Add tests for new functionality
- **Commits**: Use conventional commit messages
- **Issues**: Link PRs to relevant issues

## License

This project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.

### Third-Party Licenses

- **Microsoft Florence-2**: [MIT License](https://github.com/microsoft/Florence)
- **HuggingFace Transformers**: [Apache License 2.0](https://github.com/huggingface/transformers)
- **Gradio**: [Apache License 2.0](https://github.com/gradio-app/gradio)
- **EasyOCR**: [Apache License 2.0](https://github.com/JaidedAI/EasyOCR)

## Acknowledgments

Special thanks to:

- **Microsoft Research** for the incredible Florence-2 vision-language model
- **HuggingFace** for the transformers library and Spaces platform
- **Gradio Team** for the amazing web interface framework
- **JaidedAI** for EasyOCR fallback capabilities
- **Open Source Community** for continuous support and contributions

## Project Status

| Component         | Status     | Version |
| ----------------- | ---------- | ------- |
| **Core OCR**      | Stable     | v1.0.0  |
| **Web UI**        | Stable     | v1.0.0  |
| **Deployment**    | Production | v1.0.0  |
| **API**           | Stable     | v1.0.0  |
| **Documentation** | Complete   | v1.0.0  |

---