	---
	title: My FastAPI App
	emoji: 🚀
	colorFrom: blue
	colorTo: purple
	sdk: docker
	sdk_version: "0.1.0"
	app_file: app.py
	pinned: false
	---

	# Manga OCR Translator

	A web application that translates manga, comics, and light novels by combining OCR (Optical Character Recognition) with machine translation.

	## Features

	- Text Detection: Automatically detects text in manga images using PyTesseract
	- Text Translation: Translates detected text to your preferred language
	- PDF Support: Upload and translate entire PDF manga files
	- URL Support: Translate manga directly from web URLs
	- Modern UI: Clean, responsive Gumroad-like user interface
	- Multiple Languages: Support for Japanese, Korean, Chinese, and more

	## Requirements

	Before running the application, ensure you have the following installed:

	- Python 3.8+
	- [Tesseract OCR](https://github.com/tesseract-ocr/tesseract) - Must be installed and available in PATH
	- Required Python packages from `requirements.txt`

	### Installing Tesseract OCR

	#### Windows
	1. Download the installer from [UB Mannheim](https://github.com/UB-Mannheim/tesseract/wiki)
	2. Run the installer, then add the installation directory to your `PATH` environment variable
	3. The default executable path is `C:\Program Files\Tesseract-OCR\tesseract.exe`

	#### macOS
	```bash
	brew install tesseract
	```

	#### Linux (Ubuntu/Debian)
	```bash
	sudo apt update
	sudo apt install tesseract-ocr
	sudo apt install libtesseract-dev
	# Install language data packages as needed
	sudo apt install tesseract-ocr-jpn tesseract-ocr-chi-sim tesseract-ocr-kor
	```

	## Installation

	1. Clone the repository:
	```bash
	git clone https://github.com/yourusername/manga-ocr-translator.git
	cd manga-ocr-translator
	```

	2. Install Python dependencies:
	```bash
	pip install -r requirements.txt
	```

	3. If Tesseract is not in your `PATH`, point pytesseract at the executable explicitly:
	```python
	# Place this at the top of utils/ocr.py, before any OCR call
	import pytesseract
	pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'  # Windows example
	```
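
As a minimal sketch of how the path could be resolved automatically, the helper below first looks for `tesseract` on `PATH` and then tries a list of fallback locations. The names `find_tesseract` and `FALLBACK_PATHS` are hypothetical and not part of this project; the only fallback listed is the default Windows path mentioned above.

```python
import os
import shutil
from typing import Optional, Sequence

# Hypothetical fallback locations; adjust these to match your installation.
FALLBACK_PATHS = (
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",  # default Windows path from this README
)


def find_tesseract(fallbacks: Sequence[str] = FALLBACK_PATHS) -> Optional[str]:
    """Return the path to the Tesseract executable, or None if it cannot be found."""
    # Prefer whatever is already available on PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise try each known install location in order.
    for candidate in fallbacks:
        if os.path.isfile(candidate):
            return candidate
    return None
```

If `find_tesseract()` returns a path, it can be assigned to `pytesseract.pytesseract.tesseract_cmd`; if it returns `None`, Tesseract still needs to be installed.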

	## Usage

	### Starting the Server

	Run the application:
	```bash
	python app.py
	```

	Then open a web browser and navigate to:
	```
	http://localhost:8000
	```

	### Using the Web Interface

	1. Access the main page at `http://localhost:8000`
	2. Choose either URL or PDF upload method
	3. Configure source and target languages
	4. Submit for translation
	5. View and download the translated images

	### API Endpoints

	The application also provides API endpoints for integration:

	- `POST /translate/url` - Translate manga from a URL
	- `POST /translate/pdf` - Translate manga from a PDF file
	- `GET /api/info` - Get API information
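
The sketch below shows one way to query `GET /api/info` using only the standard library. It assumes the server is running at `http://localhost:8000` and that the endpoint returns JSON, which this README does not state. The helper names `endpoint` and `get_api_info` are hypothetical. The request formats of the two `POST` endpoints are not documented here, so they are not sketched.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default address from the Usage section


def endpoint(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def get_api_info(base_url: str = BASE_URL, timeout: float = 10.0):
    """Call GET /api/info and decode the body, assuming it is JSON."""
    url = endpoint("/api/info", base_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `get_api_info()` returns the decoded response. If the server is down, it raises `urllib.error.URLError`.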

	## Development

	### Project Structure

	```
	manga-ocr-translator/
	├── app.py               # Main FastAPI application
	├── requirements.txt     # Python dependencies
	├── static/              # Static files
	│   ├── translated/      # Translated images output
	│   └── ui/              # UI assets
	│       ├── index.html   # Main UI page
	│       ├── styles.css   # CSS styles
	│       └── scripts.js   # JavaScript functionality
	└── utils/               # Utility modules
	    ├── ocr.py           # OCR functionality using pytesseract
	    ├── image.py         # Image processing utilities
	    ├── pdf.py           # PDF handling
	    ├── translation.py   # Translation services
	    └── web.py           # Web scraping utilities
	```

	## Configuration

	You can configure the application by modifying the following files:

	- `app.py`: Main application settings
	- `utils/ocr.py`: OCR settings and language mapping
	- `requirements.txt`: Package dependencies
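
To illustrate what a language mapping might look like, the sketch below translates two-letter language codes into the codes Tesseract expects. The dictionary `TESSERACT_LANGS` and the function `to_tesseract_lang` are hypothetical; the actual mapping in `utils/ocr.py` may be structured differently. The Tesseract codes themselves (`jpn`, `kor`, `chi_sim`, `eng`) are the standard ones.

```python
# Hypothetical mapping from ISO 639-1 codes to Tesseract language codes.
TESSERACT_LANGS = {
    "ja": "jpn",      # Japanese
    "ko": "kor",      # Korean
    "zh": "chi_sim",  # Chinese (Simplified)
    "en": "eng",      # English
}


def to_tesseract_lang(code: str, default: str = "eng") -> str:
    """Return the Tesseract code for a language, or `default` if it is unknown."""
    return TESSERACT_LANGS.get(code.strip().lower(), default)
```

The result can be passed as the `lang` argument of `pytesseract.image_to_string`, provided the matching language data is installed.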

	## Troubleshooting

	### Common Issues

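	1. Tesseract not found: Ensure Tesseract is installed and either available in your `PATH` or configured through `pytesseract.pytesseract.tesseract_cmd`
	2. Missing language data: Install the Tesseract language data package for each source language (for example, `tesseract-ocr-jpn` on Ubuntu/Debian)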
	3. Slow performance: Tesseract itself runs on the CPU, so a GPU will generally not speed up OCR; try translating fewer pages per request or using smaller images
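
The first two issues can be checked from Python with a sketch like the one below. It runs `tesseract --list-langs` and reports which required languages are missing. The function names are hypothetical, and the parser assumes the usual output format: a header line starting with "List of available languages", followed by one language code per line.

```python
import shutil
import subprocess
from typing import Iterable, List


def parse_list_langs(output: str) -> List[str]:
    """Extract language codes from the output of `tesseract --list-langs`."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the header line.
        if not line or line.startswith("List of available languages"):
            continue
        langs.append(line)
    return langs


def missing_languages(required: Iterable[str], tesseract_cmd: str = "tesseract") -> List[str]:
    """Return the required language codes that Tesseract does not have installed."""
    exe = shutil.which(tesseract_cmd)
    if exe is None:
        raise FileNotFoundError("Tesseract executable not found in PATH")
    result = subprocess.run(
        [exe, "--list-langs"], capture_output=True, text=True, check=True
    )
    # Fall back to stderr in case the list was printed there instead of stdout.
    installed = parse_list_langs(result.stdout or result.stderr)
    return sorted(set(required) - set(installed))
```

For example, `missing_languages(["jpn", "kor", "chi_sim"])` returns an empty list when all three language packs are installed.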

	## License

	This project is licensed under the MIT License - see the LICENSE file for details.

	## Acknowledgements

	- [PyTesseract](https://github.com/madmaze/pytesseract) for OCR capabilities
	- [FastAPI](https://fastapi.tiangolo.com/) for the web framework
	- [Pillow](https://pillow.readthedocs.io/) for image processing