---
title: TextLens - AI-Powered OCR
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
---

# 🔍 TextLens - AI-Powered OCR

[![Deploy to HuggingFace](https://img.shields.io/badge/🤗-Deploy%20to%20Spaces-blue)](https://huggingface.co/spaces/GoConqurer/textlens-ocr)
[![GitHub](https://img.shields.io/badge/GitHub-Repository-green)](https://github.com/KumarAmrit30/textlens-ocr)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)

A state-of-the-art Vision-Language Model (VLM) based OCR application that extracts text from images using Microsoft Florence-2 with intelligent fallback systems and enterprise-grade zero downtime deployment.

## 🚀 Live Demo

**🔗 Try it now:** [https://huggingface.co/spaces/GoConqurer/textlens-ocr](https://huggingface.co/spaces/GoConqurer/textlens-ocr)

![TextLens Demo](https://img.shields.io/badge/Demo-Live-brightgreen)

## ✨ Key Features

### 🤖 Advanced AI-Powered OCR

- **Microsoft Florence-2 VLM**: State-of-the-art vision-language model for text extraction
- **Intelligent Fallback System**: Automatic fallback to EasyOCR if primary model fails
- **Multi-Model Support**: Florence-2-base and Florence-2-large variants
- **Real-time Processing**: Instant text extraction on image upload

### 🎨 Modern User Experience

- **Clean UI**: Professional Gradio interface with intuitive design
- **Multiple Input Methods**: Upload files, use webcam, or paste from clipboard
- **Copy-to-Clipboard**: One-click text copying functionality
- **Responsive Design**: Works seamlessly on desktop and mobile devices
- **Dark/Light Theme**: Automatic theme adaptation

### ⚡ Performance & Reliability

- **GPU Acceleration**: Supports CUDA, MPS (Apple Silicon), and CPU inference
- **Smart Device Detection**: Automatically uses best available hardware
- **Error Resilience**: Robust error handling with graceful degradation
- **Memory Optimization**:
  Efficient model loading and cleanup

### 🛡️ Enterprise Features

- **Zero Downtime Deployment**: Blue-green deployment with health checks
- **Health Monitoring**: Built-in `/health` and `/ready` endpoints
- **Graceful Shutdown**: Signal handling for clean application restarts
- **Production Ready**: Scalable architecture with automated deployment

## 🚀 Quick Start

### 🌐 Online (Recommended)

**Instant access** - No installation required:

👉 [**Launch TextLens**](https://huggingface.co/spaces/GoConqurer/textlens-ocr)

### 💻 Local Development

1. **Clone Repository**

   ```bash
   git clone https://github.com/KumarAmrit30/textlens-ocr.git
   cd textlens-ocr
   ```

2. **Setup Environment**

   ```bash
   python -m venv textlens_env
   source textlens_env/bin/activate  # Windows: textlens_env\Scripts\activate
   pip install -r requirements.txt
   ```

3. **Launch Application**

   ```bash
   python app.py
   ```

   🌐 Open: `http://localhost:7860`

### 🧪 Quick Test

```bash
# Verify installation
python -c "from models.ocr_processor import OCRProcessor; print('✅ TextLens ready!')"
```

## 📊 Model Performance

| Model                | Size  | Speed     | Accuracy     | Best For               |
| -------------------- | ----- | --------- | ------------ | ---------------------- |
| **Florence-2-base**  | 230M  | ⚡ Fast   | 📈 High      | General OCR, Real-time |
| **Florence-2-large** | 770M  | 🐌 Medium | 📊 Very High | High accuracy needs    |
| **EasyOCR**          | ~100M | 🚀 Medium | 📋 Good      | Fallback, Multilingual |

## 🎯 Supported Use Cases

| Category            | Examples                        | Performance |
| ------------------- | ------------------------------- | ----------- |
| 📄 **Documents**    | PDFs, Scanned papers, Forms     | ⭐⭐⭐⭐⭐  |
| 🧾 **Receipts**     | Shopping receipts, Invoices     | ⭐⭐⭐⭐    |
| 📱 **Screenshots**  | App interfaces, Error messages  | ⭐⭐⭐⭐⭐  |
| 🚗 **Vehicle**      | License plates, VIN numbers     | ⭐⭐⭐⭐    |
| 📚 **Books**        | Printed text, Handwritten notes | ⭐⭐⭐⭐    |
| 🌍 **Multilingual** | Multiple languages              | ⭐⭐⭐      |

## 🔧 Configuration

### 🎛️ Model Selection

```python
from models.ocr_processor import OCRProcessor

# Fast inference (recommended)
ocr = OCRProcessor(model_name="microsoft/Florence-2-base")

# Maximum accuracy
ocr = OCRProcessor(model_name="microsoft/Florence-2-large")
```

### 🎨 UI Customization

Modify `ui/styles.py` to customize appearance:

```python
# Change color scheme
PRIMARY_COLOR = "#1f77b4"
SECONDARY_COLOR = "#ff7f0e"

# Update layout
INTERFACE_WIDTH = "100%"
```

### ⚙️ Environment Variables

| Variable               | Description          | Default                |
| ---------------------- | -------------------- | ---------------------- |
| `SPACE_ID`             | HuggingFace Space ID | Auto-detected          |
| `DEPLOYMENT_STAGE`     | Deployment stage     | `production`           |
| `TRANSFORMERS_CACHE`   | Model cache path     | `~/.cache/huggingface` |
| `CUDA_VISIBLE_DEVICES` | GPU selection        | All available          |

**Deployment Flow:**

```mermaid
graph LR
    A[Code Push] --> B[Validate]
    B --> C[Deploy Staging]
    C --> D[Health Check]
    D --> E[Deploy Production]
    E --> F[Verify]
    F --> G[Complete ✅]
```

## 🤝 Contributing

We welcome contributions! Here's how to get started:

### 🔧 Development Setup

1. **Fork & Clone**

   ```bash
   git clone https://github.com/YOUR_USERNAME/textlens-ocr.git
   cd textlens-ocr
   ```

2. **Create Branch**

   ```bash
   git checkout -b feature/your-feature-name
   ```

3. **Make Changes**

   - Add new features or fix bugs
   - Update tests and documentation
   - Follow code style guidelines

4. **Test Changes**

   ```bash
   python -m pytest tests/
   python -c "from models.ocr_processor import OCRProcessor; OCRProcessor()"
   ```

5. **Submit PR**

   ```bash
   git add .
   git commit -m "feat: add your feature description"
   git push origin feature/your-feature-name
   ```

### 📝 Contribution Guidelines

- **Code Style**: Follow PEP 8, use Black formatter
- **Documentation**: Update README and docstrings
- **Tests**: Add tests for new functionality
- **Commits**: Use conventional commit messages
- **Issues**: Link PRs to relevant issues

## 📄 License

This project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.

### 🙏 Third-Party Licenses

- **Microsoft Florence-2**: [MIT License](https://github.com/microsoft/Florence)
- **HuggingFace Transformers**: [Apache License 2.0](https://github.com/huggingface/transformers)
- **Gradio**: [Apache License 2.0](https://github.com/gradio-app/gradio)
- **EasyOCR**: [Apache License 2.0](https://github.com/JaidedAI/EasyOCR)

## 🌟 Acknowledgments

Special thanks to:

- **Microsoft Research** for the incredible Florence-2 vision-language model
- **HuggingFace** for the transformers library and Spaces platform
- **Gradio Team** for the amazing web interface framework
- **JaidedAI** for EasyOCR fallback capabilities
- **Open Source Community** for continuous support and contributions

## 📈 Project Status

| Component         | Status        | Version |
| ----------------- | ------------- | ------- |
| **Core OCR**      | ✅ Stable     | v1.0.0  |
| **Web UI**        | ✅ Stable     | v1.0.0  |
| **Deployment**    | ✅ Production | v1.0.0  |
| **API**           | ✅ Stable     | v1.0.0  |
| **Documentation** | ✅ Complete   | v1.0.0  |

### 📊 Stats

![GitHub stars](https://img.shields.io/github/stars/KumarAmrit30/textlens-ocr?style=social)
![GitHub forks](https://img.shields.io/github/forks/KumarAmrit30/textlens-ocr?style=social)
![GitHub watchers](https://img.shields.io/github/watchers/KumarAmrit30/textlens-ocr?style=social)

---
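
For readers curious how the fallback listed under Key Features (Florence-2 first, EasyOCR if the primary model fails) might work, here is a minimal sketch. It is an illustration under assumptions, not the project's actual code: `extract_with_fallback` and the two stand-in engines are hypothetical names, and a real setup would wrap the Florence-2 and EasyOCR calls in functions with the same shape.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical engine signature: takes an image path, returns extracted text.
Engine = Callable[[str], str]


def extract_with_fallback(
    image_path: str, engines: Sequence[Tuple[str, Engine]]
) -> Tuple[str, str]:
    """Try each (name, engine) pair in order and return (engine_name, text)."""
    errors = []
    for name, engine in engines:
        try:
            text = engine(image_path)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
            continue
        if text.strip():
            return name, text
        errors.append(f"{name}: empty result")
    raise RuntimeError("all OCR engines failed: " + "; ".join(errors))


# Stand-in engines for illustration only (not the project's real classes).
def failing_primary(path: str) -> str:
    raise RuntimeError("model failed to load")


def simple_backup(path: str) -> str:
    return f"text from {path}"


used, text = extract_with_fallback(
    "receipt.png",
    [("florence-2", failing_primary), ("easyocr", simple_backup)],
)
print(used, "->", text)  # prints: easyocr -> text from receipt.png
```

Passing the engines as an ordered list keeps the fallback order explicit and makes it easy to add a third engine or to test the logic without loading any model.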
**Made with ❤️ for the AI community**

[⭐ Star this repo](https://github.com/KumarAmrit30/textlens-ocr) • [🔗 Try the demo](https://huggingface.co/spaces/GoConqurer/textlens-ocr) • [📖 Read docs](DEPLOYMENT.md)