---
title: My FastAPI App
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: "0.1.0"
app_file: app.py
pinned: false
---
# Manga OCR Translator
A powerful tool for translating manga, comics, and light novels using OCR (Optical Character Recognition) and machine translation.
## Features
- **Text Detection**: Automatically detects text in manga images using PyTesseract
- **Text Translation**: Translates detected text to your preferred language
- **PDF Support**: Upload and translate entire PDF manga files
- **URL Support**: Translate manga directly from web URLs
- **Modern UI**: Clean, responsive Gumroad-like user interface
- **Multiple Languages**: Support for Japanese, Korean, Chinese, and more
## Requirements
Before running the application, ensure you have the following installed:
- Python 3.8+
- [Tesseract OCR](https://github.com/tesseract-ocr/tesseract) - Must be installed and available in PATH
- Required Python packages from `requirements.txt`
### Installing Tesseract OCR
#### Windows
1. Download the installer from [UB Mannheim](https://github.com/UB-Mannheim/tesseract/wiki)
2. Run the installer and add the installation directory to your `PATH` environment variable
3. Default installation path: `C:\Program Files\Tesseract-OCR\tesseract.exe`
#### macOS
```bash
brew install tesseract
```
#### Linux (Ubuntu/Debian)
```bash
sudo apt update
sudo apt install tesseract-ocr
sudo apt install libtesseract-dev
# Install language data packages as needed
sudo apt install tesseract-ocr-jpn tesseract-ocr-chi-sim tesseract-ocr-kor
```
## Installation
1. Clone the repository:
```bash
git clone https://github.com/yourusername/manga-ocr-translator.git
cd manga-ocr-translator
```
2. Install Python dependencies:
```bash
pip install -r requirements.txt
```
3. Set the Tesseract executable path in your environment (only if not in PATH):
```python
# In your environment or at the top of utils/ocr.py
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe' # Windows example
```
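The lookup behind step 3 can be sketched with the standard library alone. The helper below, `find_tesseract`, is a hypothetical name and not part of this repository; it checks `PATH` first and then falls back to a list of candidate paths, using the default Windows path given above as the only built-in fallback.

```python
import os
import shutil
from typing import Iterable, Optional

# Default Windows install path mentioned in this README; extend as needed.
FALLBACK_PATHS = [r"C:\Program Files\Tesseract-OCR\tesseract.exe"]


def find_tesseract(fallbacks: Iterable[str] = FALLBACK_PATHS) -> Optional[str]:
    """Return the path to the tesseract executable, or None if it cannot be found."""
    # Prefer whatever is reachable through PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise try each fallback location in order.
    for candidate in fallbacks:
        if os.path.isfile(candidate):
            return candidate
    return None
```

If the function returns a path, it can be assigned to `pytesseract.pytesseract.tesseract_cmd` as shown in step 3; if it returns `None`, Tesseract is most likely not installed.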
## Usage
### Starting the Server
Run the application:
```bash
python app.py
```
Then open a web browser and navigate to:
```
http://localhost:8000
```
### Using the Web Interface
1. Access the main page at `http://localhost:8000`
2. Choose either URL or PDF upload method
3. Configure source and target languages
4. Submit for translation
5. View and download the translated images
### API Endpoints
The application also provides API endpoints for integration:
- **POST /translate/url** - Translate manga from a URL
- **POST /translate/pdf** - Translate manga from a PDF file
- **GET /api/info** - Get API information
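A client call to the URL endpoint might look like the sketch below. This README does not document the request schema, so the JSON body and its field names (`url`, `source_lang`, `target_lang`) are assumptions made for illustration; check `app.py` for the actual parameters. Only the endpoint path and the default address come from this README.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default address from this README


def build_translate_url_request(
    page_url: str,
    source_lang: str,
    target_lang: str,
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /translate/url.

    The payload field names are hypothetical; the real API may expect
    different names or form-encoded data instead of JSON.
    """
    payload = {
        "url": page_url,
        "source_lang": source_lang,
        "target_lang": target_lang,
    }
    return urllib.request.Request(
        url=f"{base_url}/translate/url",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the server running, passing the returned object to `urllib.request.urlopen` sends the request. Building the request separately from sending it keeps the assumed payload easy to inspect and adjust.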
## Development
### Project Structure
```
manga-ocr-translator/
├── app.py                  # Main FastAPI application
├── requirements.txt        # Python dependencies
├── static/                 # Static files
│   ├── translated/         # Translated images output
│   └── ui/                 # UI assets
│       ├── index.html      # Main UI page
│       ├── styles.css      # CSS styles
│       └── scripts.js      # JavaScript functionality
└── utils/                  # Utility modules
    ├── ocr.py              # OCR functionality using pytesseract
    ├── image.py            # Image processing utilities
    ├── pdf.py              # PDF handling
    ├── translation.py      # Translation services
    └── web.py              # Web scraping utilities
```
## Configuration
You can configure the application by modifying the following files:
- **app.py**: Main application settings
- **utils/ocr.py**: OCR settings and language mapping
- **requirements.txt**: Package dependencies
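The language mapping in `utils/ocr.py` is not shown in this README. As an illustration of what such a mapping could look like, the sketch below converts ISO 639-1 codes into Tesseract language codes, which correspond to the language data packages installed earlier. The names `TESSERACT_LANGS` and `to_tesseract_lang` are hypothetical, and the actual module may be organized differently.

```python
# Hypothetical mapping from ISO 639-1 codes to Tesseract language codes.
TESSERACT_LANGS = {
    "en": "eng",
    "ja": "jpn",
    "ko": "kor",
    "zh": "chi_sim",  # Simplified Chinese; Traditional Chinese is "chi_tra"
}


def to_tesseract_lang(code: str) -> str:
    """Return the Tesseract language code for an ISO 639-1 code."""
    key = code.strip().lower()
    try:
        return TESSERACT_LANGS[key]
    except KeyError:
        # Fail loudly so an unsupported language is not silently OCR'd as English.
        raise ValueError(f"Unsupported language code: {code!r}") from None
```

The returned string is the kind of value pytesseract accepts through the `lang` argument of `image_to_string`.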
## Troubleshooting
### Common Issues
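1. **Tesseract not found**: Ensure Tesseract is installed and on your `PATH`, or set `pytesseract.pytesseract.tesseract_cmd` to the full path of the executable
2. **Missing language data**: Install the Tesseract language data packages for the languages you translate from (for example, `tesseract-ocr-jpn` on Ubuntu/Debian)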
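3. **Slow performance**: Standard Tesseract builds run on the CPU, so a GPU is unlikely to speed up OCR; try downscaling very large pages or translating fewer pages per request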
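To diagnose issue 2, it can help to check which language files are actually present. Tesseract stores each language as `<code>.traineddata` in its `tessdata` directory; the location of that directory varies by platform and version, so the sketch below takes it as an argument. `missing_languages` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path
from typing import Iterable, List


def missing_languages(tessdata_dir: str, required: Iterable[str]) -> List[str]:
    """Return the language codes that have no <code>.traineddata file in tessdata_dir."""
    # File stems are the language codes, e.g. "jpn" for jpn.traineddata.
    installed = {p.stem for p in Path(tessdata_dir).glob("*.traineddata")}
    return [lang for lang in required if lang not in installed]
```

Running `tesseract --list-langs` on the command line gives the same information directly from Tesseract.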
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgements
- [PyTesseract](https://github.com/madmaze/pytesseract) for OCR capabilities
- [FastAPI](https://fastapi.tiangolo.com/) for the web framework
- [Pillow](https://pillow.readthedocs.io/) for image processing