Commit · 760f6ef
1 Parent(s): 67e2508
Update OCR model in handlers.py and clean up README.md
Changes:
- Updated OCR model from "microsoft/Florence-2-base" to "microsoft/Florence-2-large" for improved performance.
- Removed outdated architecture and deployment sections from README.md for clarity and conciseness.
This enhances the application's capabilities and streamlines documentation.
- README.md +0 -241
- ui/handlers.py +1 -1
README.md
CHANGED
@@ -14,7 +14,6 @@ license: mit
 
 [](https://huggingface.co/spaces/GoConqurer/textlens-ocr)
 [](https://github.com/KumarAmrit30/textlens-ocr)
-[](LICENSE)
 [](https://www.python.org/downloads/)
 
 A state-of-the-art Vision-Language Model (VLM) based OCR application that extracts text from images using Microsoft Florence-2 with intelligent fallback systems and enterprise-grade zero downtime deployment.
@@ -56,29 +55,6 @@ A state-of-the-art Vision-Language Model (VLM) based OCR application that extrac
 - **Graceful Shutdown**: Signal handling for clean application restarts
 - **Production Ready**: Scalable architecture with automated deployment
 
-## Architecture
-
-```
-textlens-ocr/
-├── Frontend (Gradio UI)
-│   ├── ui/interface.py # Main interface components
-│   ├── ui/handlers.py # Event handlers & logic
-│   └── ui/styles.py # CSS styling & themes
-├── AI Models
-│   └── models/ocr_processor.py # OCR engine with fallbacks
-├── Utilities
-│   └── utils/image_utils.py # Image preprocessing
-├── Deployment
-│   ├── .github/workflows/ # CI/CD pipelines
-│   ├── scripts/deploy.py # Manual deployment tools
-│   └── deployment.config.yml # Deployment configuration
-├── Documentation
-│   ├── README.md # Main documentation
-│   └── DEPLOYMENT.md # Deployment guide
-└── Configuration
-    ├── app.py # Main application entry
-    └── requirements.txt # Dependencies
-```
 
 ## Quick Start
 
@@ -172,41 +148,7 @@ INTERFACE_WIDTH = "100%"
 | `TRANSFORMERS_CACHE` | Model cache path | `~/.cache/huggingface` |
 | `CUDA_VISIBLE_DEVICES` | GPU selection | All available |
 
-## Deployment
-
-### HuggingFace Spaces (Recommended)
-
-**Automatic Deployment:**
-
-1. Fork this repository
-2. Push to `main`/`master` branch
-3. GitHub Actions automatically deploys to HuggingFace Spaces
-4. Access your deployed app at: `https://huggingface.co/spaces/USERNAME/textlens-ocr`
-
-**Manual Deployment:**
-
-1. Go to [GitHub Actions](https://github.com/KumarAmrit30/textlens-ocr/actions)
-2. Select "Deploy to HuggingFace Spaces"
-3. Click "Run workflow"
-4. Choose deployment type:
-   - **Direct**: Quick deployment to production
-   - **Blue-Green**: Zero downtime with staging validation
-
-### Zero Downtime Deployment
-
-Our enterprise-grade deployment system ensures **zero downtime** for users:
 
-**Features:**
-
-- **Blue-Green Deployment**: Test in staging before production
-- **Health Monitoring**: Automatic health checks with retry logic
-- **Graceful Shutdown**: Clean application restarts
-- **Real-time Monitoring**: Deployment status tracking
-
-**Health Endpoints:**
-
-- `GET /health` - Application health status
-- `GET /ready` - Application readiness check
 
 **Deployment Flow:**
 
@@ -220,170 +162,6 @@ graph LR
 F --> G[Complete ✅]
 ```
 
-### Docker Deployment
-
-```dockerfile
-FROM python:3.9-slim
-
-WORKDIR /app
-COPY requirements.txt .
-RUN pip install -r requirements.txt
-
-COPY . .
-EXPOSE 7860
-
-CMD ["python", "app.py"]
-```
-
-Build and run:
-
-```bash
-docker build -t textlens-ocr .
-docker run -p 7860:7860 textlens-ocr
-```
-
-### Cloud Platforms
-
-| Platform | Status | Guide |
-| ---------------------- | ------------- | ------------------------------------------------------------------- |
-| **HuggingFace Spaces** | ✅ Ready | [Deploy Now](https://huggingface.co/spaces/GoConqurer/textlens-ocr) |
-| **Google Colab** | ✅ Compatible | Open in Colab |
-| **AWS/GCP/Azure** | Docker | Use Docker deployment |
-| **Heroku** | ⚠️ Limited | GPU not available |
-
-## Testing & Development
-
-### Running Tests
-
-```bash
-# Basic functionality test
-python -c "
-from models.ocr_processor import OCRProcessor
-ocr = OCRProcessor()
-print(f'✅ Model loaded: {ocr.get_model_info()}')
-"
-
-# Test with sample image
-python -c "
-from PIL import Image
-from models.ocr_processor import OCRProcessor
-import requests
-
-# Download test image
-img_url = 'https://via.placeholder.com/300x100/000000/FFFFFF?text=Hello+World'
-image = Image.open(requests.get(img_url, stream=True).raw)
-
-# Test OCR
-ocr = OCRProcessor()
-result = ocr.extract_text(image)
-print(f'✅ OCR Result: {result}')
-"
-```
-
-### Development Tools
-
-```bash
-# Install development dependencies
-pip install -r requirements.txt
-
-# Format code
-black . --line-length 88
-
-# Type checking
-mypy models/ utils/ ui/
-
-# Lint code
-flake8 --max-line-length 88
-```
-
-## API Reference
-
-### OCRProcessor Class
-
-```python
-from models.ocr_processor import OCRProcessor
-
-# Initialize processor
-ocr = OCRProcessor(
-    model_name="microsoft/Florence-2-base", # Model selection
-    device=None, # Auto-detect device
-    torch_dtype=None # Auto-select dtype
-)
-
-# Extract text from image
-text = ocr.extract_text(image)
-# Returns: str
-
-# Extract text with bounding boxes
-result = ocr.extract_text_with_regions(image)
-# Returns: dict with text and regions
-
-# Get model information
-info = ocr.get_model_info()
-# Returns: dict with model details
-
-# Cleanup resources
-ocr.cleanup()
-```
-
-### Health Check API
-
-```bash
-# Check application health
-curl https://huggingface.co/spaces/GoConqurer/textlens-ocr/health
-
-# Response:
-{
-  "status": "healthy",
-  "timestamp": 1640995200,
-  "version": "1.0.0",
-  "environment": "production"
-}
-
-# Check readiness
-curl https://huggingface.co/spaces/GoConqurer/textlens-ocr/ready
-
-# Response:
-{
-  "status": "ready",
-  "timestamp": 1640995200
-}
-```
-
-## Troubleshooting
-
-### Common Issues
-
-| Issue | Symptoms | Solution |
-| ----------------------- | ------------------------ | --------------------------------------- |
-| **Model Loading Error** | ImportError, CUDA errors | Check GPU drivers, install CUDA toolkit |
-| **Memory Error** | Out of memory | Reduce batch size, use CPU inference |
-| **SSL Certificate** | SSL errors on macOS | Run certificate update command |
-| **Permission Error** | File access denied | Check file permissions, run as admin |
-
-### Debug Commands
-
-```bash
-# Check CUDA availability
-python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}')"
-
-# Check transformers version
-python -c "import transformers; print(f'Transformers: {transformers.__version__}')"
-
-# Test health endpoint locally
-curl http://localhost:7860/health
-
-# View application logs
-tail -f textlens.log
-```
-
-### Getting Help
-
-1. **Check existing issues**: [GitHub Issues](https://github.com/KumarAmrit30/textlens-ocr/issues)
-2. **Create new issue**: Provide error details and environment info
-3. **Join discussion**: [GitHub Discussions](https://github.com/KumarAmrit30/textlens-ocr/discussions)
-4. **Contact**: Create an issue for direct support
-
 ## Contributing
 
 We welcome contributions! Here's how to get started:
@@ -462,25 +240,6 @@ Special thanks to:
 | **API** | ✅ Stable | v1.0.0 |
 | **Documentation** | ✅ Complete | v1.0.0 |
 
-### Roadmap
-
-- [ ] **Multi-language UI** support
-- [ ] **Batch processing** for multiple images
-- [ ] **API rate limiting** and authentication
-- [ ] **Custom model** fine-tuning support
-- [ ] **Mobile app** development
-- [ ] **Cloud storage** integration
-
-## Support & Community
-
-### Links
-
-- **Homepage**: [GitHub Repository](https://github.com/KumarAmrit30/textlens-ocr)
-- **Live Demo**: [HuggingFace Spaces](https://huggingface.co/spaces/GoConqurer/textlens-ocr)
-- **Issues**: [Report Bugs](https://github.com/KumarAmrit30/textlens-ocr/issues)
-- **Discussions**: [GitHub Discussions](https://github.com/KumarAmrit30/textlens-ocr/discussions)
-- **Documentation**: [Deployment Guide](DEPLOYMENT.md)
-
 ### Stats
 
ui/handlers.py
CHANGED
@@ -16,7 +16,7 @@ def initialize_ocr_processor():
     global ocr_processor
     try:
         logger.info("Initializing OCR processor...")
-        ocr_processor = OCRProcessor(model_name="microsoft/Florence-2-
+        ocr_processor = OCRProcessor(model_name="microsoft/Florence-2-large")
         return True
     except Exception as e:
         logger.error(f"Failed to initialize OCR processor: {str(e)}")
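Review note: the change to `ui/handlers.py` hard-codes the larger checkpoint. The sketch below shows one possible way to keep the model name configurable and to fall back to the smaller checkpoint if the larger one fails to load. It is not part of this commit. The environment variable `TEXTLENS_OCR_MODEL` and the helpers `candidate_models` and `load_first_available` are hypothetical names introduced here for illustration; only the two model identifiers come from the commit message.

```python
import logging
import os
from typing import Callable, Mapping, Optional, Sequence, Tuple, TypeVar

logger = logging.getLogger(__name__)

T = TypeVar("T")

# Hypothetical environment variable; not defined anywhere in the repository.
MODEL_ENV_VAR = "TEXTLENS_OCR_MODEL"

# Preferred model first, smaller model as a fallback (names from the commit message).
DEFAULT_MODELS = ("microsoft/Florence-2-large", "microsoft/Florence-2-base")


def candidate_models(env: Optional[Mapping[str, str]] = None) -> Tuple[str, ...]:
    """Return model names to try in order, putting any env override first."""
    env = os.environ if env is None else env
    override = env.get(MODEL_ENV_VAR, "").strip()
    if override:
        # Keep the defaults as fallbacks without listing the override twice.
        return (override,) + tuple(m for m in DEFAULT_MODELS if m != override)
    return DEFAULT_MODELS


def load_first_available(
    names: Sequence[str], loader: Callable[[str], T]
) -> Tuple[str, T]:
    """Call loader(name) for each name in order; return the first success."""
    last_error: Optional[Exception] = None
    for name in names:
        try:
            return name, loader(name)
        except Exception as exc:  # broad on purpose: model loading fails in many ways
            logger.warning("Could not load %s: %s", name, exc)
            last_error = exc
    raise RuntimeError("No OCR model could be loaded") from last_error
```

Inside `initialize_ocr_processor()` this could be used as `name, ocr_processor = load_first_available(candidate_models(), lambda n: OCRProcessor(model_name=n))`, assuming `OCRProcessor` raises an exception when a checkpoint cannot be loaded.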