Drag2121 committed
Commit a81573d · 1 Parent(s): 7b25815
.dockerignore ADDED
@@ -0,0 +1,51 @@
+ # Git
+ .git
+ .gitignore
+
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+
+ # Virtual environment
+ venv/
+ env/
+ ENV/
+ fastapi_env/
+
+ # Editor directories and files
+ .idea/
+ .vscode/
+ *.swp
+ *.swo
+
+ # Environment variables
+ .env
+
+ # Generated static files (keep only the .gitkeep placeholder)
+ static/translated/*
+ !static/translated/.gitkeep
+
+ # Logs
+ *.log
+
+ # OS specific
+ .DS_Store
+ Thumbs.db
Dockerfile ADDED
@@ -0,0 +1,38 @@
+ FROM python:3.9-slim
+
+ # Create user with UID 1000 (required by HF Spaces)
+ RUN useradd -m -u 1000 user
+
+ # Set environment variables
+ ENV PATH="/home/user/.local/bin:$PATH"
+
+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     libgl1-mesa-glx \
+     libglib2.0-0 \
+     poppler-utils \
+     build-essential \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Set working directory
+ WORKDIR /app
+
+ # Copy requirements and install dependencies
+ COPY --chown=user requirements.txt .
+
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy rest of the app with correct permissions
+ COPY --chown=user . .
+
+ # Ensure static directories exist and are writable by the app user
+ RUN mkdir -p static/translated && chown -R user:user static
+
+ # Switch to non-root user (required by HF Spaces)
+ USER user
+
+ # Expose port 7860 as required by HF Spaces
+ EXPOSE 7860
+
+ # Run the FastAPI app
+ CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
README.md CHANGED
@@ -1,10 +1,152 @@
  ---
- title: Manga OCR
- emoji: 🐠
- colorFrom: gray
- colorTo: indigo
  sdk: docker
  pinned: false
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Manga OCR Translator API
+
+ A FastAPI-based API for translating manga images using OCR and machine translation. It can process manga URLs or PDF files, locate speech bubbles through proximity-based grouping of detected text regions, and translate the text inside them.
+
+ ## Features
+
+ - **Manga URL Translation**: Scrapes manga images from a URL and translates text in speech bubbles
+ - **PDF Translation**: Extracts pages from a PDF file and translates text in speech bubbles
+ - **Streaming Response**: Returns translated images as soon as they're processed
+ - **Multiple Translation Engines**: Supports Google Translate, MyMemory, Linguee, and Pollinations.ai
+ - **Multiple Languages**: Supports Japanese, Korean, Chinese, and more as source languages
+ - **Docker Support**: Easy deployment with Docker
+
+ ## API Endpoints
+
+ - **GET `/`**: Basic API information
+ - **POST `/translate/url`**: Translate manga from a URL
+ - **POST `/translate/pdf`**: Translate manga from a PDF file
+ - **GET `/docs`**: Swagger documentation
+
+ ## Running Locally
+
+ ### Prerequisites
+
+ - Python 3.9+
+ - Required system libraries for PDF processing (poppler-utils, libgl1-mesa-glx)
+
+ ### Installation
+
+ 1. Clone this repository
+ 2. Install dependencies:
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 3. Make sure you have the font file in place:
+
+ ```
+ font/Movistar Text Regular.ttf
+ ```
+
+ 4. Run the server:
+
+ ```bash
+ uvicorn app:app --reload
+ ```
+
+ The API will be available at [http://localhost:8000](http://localhost:8000)
+
+ ## Docker
+
+ Build and run with Docker:
+
+ ```bash
+ # Build the image
+ docker build -t manga-ocr-api .
+
+ # Run the container (the image exposes port 7860)
+ docker run -p 7860:7860 manga-ocr-api
+ ```
+
+ ## Deploying to Hugging Face Spaces
+
+ 1. Create a new Space on [Hugging Face Spaces](https://huggingface.co/spaces)
+ 2. Choose Docker as the Space SDK
+ 3. Upload this repository to your Space
+ 4. The container will be built and deployed automatically
+
+ ### Hugging Face Space Configuration
+
+ Create a `README.md` file in your Space with the following information:
+
+ ```markdown
  ---
+ title: Manga OCR Translator
+ emoji: 📚
+ colorFrom: blue
+ colorTo: purple
  sdk: docker
  pinned: false
  ---
 
+ # Manga OCR Translator
+
+ Translate manga images from URLs or PDF files using OCR and machine translation.
+ ```
+
+ ## Usage Examples
+
+ ### Translating from a URL
+
+ ```python
+ import requests
+ import json
+
+ url = "http://localhost:8000/translate/url"
+ payload = {
+     "url": "https://example.com/manga/chapter-1",
+     "src_lang": "ja",
+     "tgt_lang": "en",
+     "translator": "google"
+ }
+ headers = {"Content-Type": "application/json"}
+
+ response = requests.post(url, json=payload, headers=headers, stream=True)
+
+ for line in response.iter_lines():
+     if line:
+         # Strip the "data: " prefix from each server-sent event line
+         data = line.decode('utf-8').removeprefix('data: ')
+         try:
+             result = json.loads(data)
+             print(result)
+         except json.JSONDecodeError:
+             pass
+ ```
+
+ ### Translating from a PDF
+
+ ```python
+ import requests
+ import json
+
+ url = "http://localhost:8000/translate/pdf"
+ files = {"file": open("manga.pdf", "rb")}
+ data = {
+     "src_lang": "ja",
+     "tgt_lang": "en",
+     "translator": "google"
+ }
+
+ response = requests.post(url, files=files, data=data, stream=True)
+
+ for line in response.iter_lines():
+     if line:
+         # Strip the "data: " prefix from each server-sent event line
+         payload = line.decode('utf-8').removeprefix('data: ')
+         try:
+             result = json.loads(payload)
+             print(result)
+         except json.JSONDecodeError:
+             pass
+ ```
+
+ ## License
+
+ MIT
+
+ ## Acknowledgements
+
+ This project is based on the OCR and translation code from the original Gradio-based manga translator.
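The proximity-based grouping mentioned above can be sketched in a few lines. This is a hypothetical illustration, not the actual `group_text_regions` from `utils/ocr.py`: boxes are `(x1, y1, x2, y2)` tuples, and two boxes join the same bubble when their centers are within a distance threshold (union-find handles transitive merges).

```python
# Hypothetical sketch of proximity-based grouping of OCR boxes into bubbles.
# The real implementation in utils/ocr.py may use different criteria.

def group_by_proximity(boxes, threshold=50.0):
    """Cluster boxes whose centers are closer than `threshold` pixels."""
    parent = list(range(len(boxes)))

    def find(i):
        # Union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def center(b):
        return ((b[0] + b[2]) / 2, (b[1] + b[3]) / 2)

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            (ax, ay), (bx, by) = center(boxes[i]), center(boxes[j])
            if ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5 < threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(boxes[i])
    return list(groups.values())

boxes = [(0, 0, 10, 10), (5, 12, 15, 22), (200, 200, 210, 210)]
print(len(group_by_proximity(boxes)))  # → 2 (two nearby boxes merge, far one stays alone)
```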
app.py ADDED
@@ -0,0 +1,292 @@
+ import os
+ import io
+ import json
+ import uuid
+ import asyncio
+ import aiohttp
+ import uvicorn
+ from typing import List, Dict, Any, Optional, Generator
+ from fastapi import FastAPI, UploadFile, File, Form, Query, BackgroundTasks
+ from fastapi.responses import StreamingResponse, JSONResponse, FileResponse
+ from fastapi.staticfiles import StaticFiles
+ from fastapi.middleware.cors import CORSMiddleware
+ from pydantic import BaseModel, Field, HttpUrl
+ import time
+ from PIL import Image
+
+ # Import utility functions
+ from utils.ocr import detect_text, group_text_regions
+ from utils.web import scrape_comic_images, download_image
+ from utils.pdf import pdf_to_images, pdf_stream_to_images
+ from utils.image import overlay_grouped_text, save_image
+ from utils.translation import translate_grouped_regions
+
+ # Configuration
+ STATIC_DIR = "static"
+ TRANSLATED_IMAGE_DIR = os.path.join(STATIC_DIR, "translated")
+ FONT_PATH = "font/Movistar Text Regular.ttf"
+
+ # Ensure directories exist
+ os.makedirs(TRANSLATED_IMAGE_DIR, exist_ok=True)
+
+ # Initialize FastAPI app
+ app = FastAPI(
+     title="Manga OCR Translator API",
+     description="API for translating manga images using OCR and machine translation",
+     version="1.0.0",
+ )
+
+ # Add CORS middleware
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],  # Allow all origins
+     allow_credentials=True,
+     allow_methods=["*"],  # Allow all methods
+     allow_headers=["*"],  # Allow all headers
+ )
+
+ # Mount static files directory
+ app.mount("/static", StaticFiles(directory=STATIC_DIR), name="static")
+
+ # Define request and response models
+ class TranslationRequest(BaseModel):
+     url: HttpUrl = Field(..., description="URL of the manga chapter to translate")
+     src_lang: str = Field(default="auto", description="Source language (auto, ja, ko, zh)")
+     tgt_lang: str = Field(default="en", description="Target language (en, es, fr, de, it, pt, ru)")
+     translator: str = Field(default="google", description="Translation engine (google, mymemory, linguee, pollinations)")
+
+ class TranslationResponse(BaseModel):
+     status: str
+     message: str
+     images: List[str] = []
+
+ # Basic homepage route
+ @app.get("/")
+ async def root():
+     return {
+         "message": "Welcome to Manga OCR Translator API",
+         "endpoints": {
+             "/translate/url": "Translate manga from a URL",
+             "/translate/pdf": "Translate manga from a PDF file",
+             "/docs": "API documentation"
+         }
+     }
+
+ # Route for processing manga URL with streaming response
+ @app.post("/translate/url")
+ async def translate_manga_url(request: TranslationRequest):
+     """
+     Process a manga URL and return translated images with a streaming response.
+     Each image is processed and returned as soon as it's ready.
+     """
+     print(f"Received request to translate URL: {request.url}")
+
+     # Create a generator function that yields translated images
+     async def process_images():
+         try:
+             # Scrape image URLs from the manga page
+             image_urls = scrape_comic_images(str(request.url))
+             if not image_urls:
+                 event = {"status": "error", "message": "No images found at the URL", "images": []}
+                 yield f"data: {json.dumps(event)}\n\n"
+                 return
+
+             print(f"Found {len(image_urls)} images to process")
+
+             # Limit to first 5 images if too many
+             if len(image_urls) > 5:
+                 print("Limiting to first 5 images")
+                 image_urls = image_urls[:5]
+
+             # Process each image
+             for i, image_url in enumerate(image_urls):
+                 try:
+                     print(f"Processing image {i+1}/{len(image_urls)}: {image_url}")
+
+                     # Update client with status (json.dumps so clients can parse the event)
+                     event = {"status": "processing", "message": f"Processing image {i+1}/{len(image_urls)}", "image_url": image_url}
+                     yield f"data: {json.dumps(event)}\n\n"
+
+                     # Download the image
+                     image_content = await download_image(image_url)
+                     if not image_content:
+                         print(f"Failed to download image {i+1}")
+                         continue
+
+                     # Detect text regions
+                     text_regions = detect_text(image_content, request.src_lang)
+                     if not text_regions:
+                         print(f"No text detected in image {i+1}")
+                         continue
+
+                     # Group text regions
+                     grouped_regions = group_text_regions(text_regions)
+                     if not grouped_regions:
+                         print(f"No text groups formed in image {i+1}")
+                         continue
+
+                     # Translate grouped regions
+                     use_pollinations = request.translator == "pollinations"
+                     free_translator = request.translator if not use_pollinations else "google"
+
+                     translated_regions = translate_grouped_regions(
+                         grouped_regions,
+                         request.src_lang,
+                         request.tgt_lang,
+                         use_pollinations,
+                         free_translator
+                     )
+
+                     # Overlay translated text on image
+                     translated_image = overlay_grouped_text(image_content, translated_regions)
+
+                     # Save image and get path
+                     image_path = save_image(translated_image, TRANSLATED_IMAGE_DIR)
+
+                     # Create a URL to the saved image
+                     image_url = f"/static/translated/{os.path.basename(image_path)}"
+
+                     # Stream the result back to the client
+                     json_response = {
+                         "status": "success",
+                         "message": f"Processed image {i+1}/{len(image_urls)}",
+                         "image_url": image_url
+                     }
+
+                     # Send this single image result
+                     yield f"data: {json.dumps(json_response)}\n\n"
+
+                 except Exception as e:
+                     print(f"Error processing image {i+1}: {e}")
+                     event = {"status": "error", "message": f"Error processing image {i+1}: {e}"}
+                     yield f"data: {json.dumps(event)}\n\n"
+
+             # Final message
+             event = {"status": "complete", "message": "All images processed"}
+             yield f"data: {json.dumps(event)}\n\n"
+
+         except Exception as e:
+             print(f"Error in process_images: {e}")
+             event = {"status": "error", "message": f"Error: {e}"}
+             yield f"data: {json.dumps(event)}\n\n"
+
+     # Return a streaming response
+     return StreamingResponse(
+         process_images(),
+         media_type="text/event-stream",
+         headers={
+             "Cache-Control": "no-cache",
+             "Connection": "keep-alive",
+             "X-Accel-Buffering": "no"  # Disable buffering for Nginx
+         }
+     )
+
+ # Route for processing PDF file with streaming response
+ @app.post("/translate/pdf")
+ async def translate_manga_pdf(
+     file: UploadFile = File(...),
+     src_lang: str = Form("auto"),
+     tgt_lang: str = Form("en"),
+     translator: str = Form("google")
+ ):
+     """
+     Process a manga PDF file and return translated images with a streaming response.
+     Each page is processed and returned as soon as it's ready.
+     """
+     print(f"Received PDF file: {file.filename}, size: {file.size} bytes")
+
+     # Create a generator function that yields translated images
+     async def process_pdf():
+         try:
+             # Read the PDF file
+             pdf_content = await file.read()
+
+             # Convert PDF to images
+             event = {"status": "processing", "message": "Converting PDF to images..."}
+             yield f"data: {json.dumps(event)}\n\n"
+
+             # Convert PDF to images in memory
+             pdf_images = await pdf_stream_to_images(pdf_content)
+
+             if not pdf_images:
+                 event = {"status": "error", "message": "Failed to extract images from PDF", "images": []}
+                 yield f"data: {json.dumps(event)}\n\n"
+                 return
+
+             print(f"Extracted {len(pdf_images)} pages from PDF")
+
+             # Limit to first 5 pages if too many
+             if len(pdf_images) > 5:
+                 print("Limiting to first 5 pages")
+                 pdf_images = pdf_images[:5]
+
+             # Process each page
+             for i, image_content in enumerate(pdf_images):
+                 try:
+                     # Update client with status
+                     print(f"Processing PDF page {i+1}/{len(pdf_images)}")
+                     event = {"status": "processing", "message": f"Processing PDF page {i+1}/{len(pdf_images)}"}
+                     yield f"data: {json.dumps(event)}\n\n"
+
+                     # Detect text regions
+                     text_regions = detect_text(image_content, src_lang)
+                     if not text_regions:
+                         print(f"No text detected in PDF page {i+1}")
+                         continue
+
+                     # Group text regions
+                     grouped_regions = group_text_regions(text_regions)
+                     if not grouped_regions:
+                         print(f"No text groups formed in PDF page {i+1}")
+                         continue
+
+                     # Translate grouped regions
+                     use_pollinations = translator == "pollinations"
+                     free_translator = translator if not use_pollinations else "google"
+
+                     translated_regions = translate_grouped_regions(
+                         grouped_regions,
+                         src_lang,
+                         tgt_lang,
+                         use_pollinations,
+                         free_translator
+                     )
+
+                     # Overlay translated text on image
+                     translated_image = overlay_grouped_text(image_content, translated_regions)
+
+                     # Save image and get path
+                     image_path = save_image(translated_image, TRANSLATED_IMAGE_DIR)
+
+                     # Create a URL to the saved image
+                     image_url = f"/static/translated/{os.path.basename(image_path)}"
+
+                     # Stream the result back to the client
+                     json_response = {
+                         "status": "success",
+                         "message": f"Processed PDF page {i+1}/{len(pdf_images)}",
+                         "image_url": image_url
+                     }
+
+                     # Send this single image result
+                     yield f"data: {json.dumps(json_response)}\n\n"
+
+                 except Exception as e:
+                     print(f"Error processing PDF page {i+1}: {e}")
+                     event = {"status": "error", "message": f"Error processing PDF page {i+1}: {e}"}
+                     yield f"data: {json.dumps(event)}\n\n"
+
+             # Final message
+             event = {"status": "complete", "message": "All PDF pages processed"}
+             yield f"data: {json.dumps(event)}\n\n"
+
+         except Exception as e:
+             print(f"Error in process_pdf: {e}")
+             event = {"status": "error", "message": f"Error: {e}"}
+             yield f"data: {json.dumps(event)}\n\n"
+
+     # Return a streaming response
+     return StreamingResponse(
+         process_pdf(),
+         media_type="text/event-stream",
+         headers={
+             "Cache-Control": "no-cache",
+             "Connection": "keep-alive",
+             "X-Accel-Buffering": "no"  # Disable buffering for Nginx
+         }
+     )
+
+ # Main entry point
+ if __name__ == "__main__":
+     print("Starting Manga OCR Translator API server...")
+     uvicorn.run("app:app", host="0.0.0.0", port=8000, reload=True)
client_example.py ADDED
@@ -0,0 +1,188 @@
+ #!/usr/bin/env python
+ """
+ Client example for Manga OCR Translator API
+ Demonstrates how to use the API with streaming responses
+ """
+
+ import sys
+ import requests
+ import json
+ import argparse
+ import webbrowser
+ from urllib.parse import urljoin
+ from pprint import pprint
+
+ # Debug print statement
+ print("Initializing Manga OCR Translator API client...")
+
+ def process_url(api_url, manga_url, src_lang, tgt_lang, translator):
+     """Process a manga URL and display results with streaming."""
+     endpoint = urljoin(api_url, "translate/url")
+
+     print(f"Translating manga from URL: {manga_url}")
+     print(f"Source language: {src_lang}, Target language: {tgt_lang}")
+     print(f"Using translator: {translator}")
+
+     payload = {
+         "url": manga_url,
+         "src_lang": src_lang,
+         "tgt_lang": tgt_lang,
+         "translator": translator
+     }
+     headers = {"Content-Type": "application/json"}
+
+     # Make the API request with streaming enabled
+     print("\nSending request to API...\n")
+     try:
+         response = requests.post(endpoint, json=payload, headers=headers, stream=True)
+         response.raise_for_status()  # Raise exception for 4XX/5XX status codes
+
+         # Process the streaming response
+         image_urls = []
+         for line in response.iter_lines():
+             if line:
+                 # Strip the SSE "data: " prefix (only at the start of the line)
+                 data = line.decode('utf-8').removeprefix('data: ')
+                 try:
+                     # Parse the JSON data
+                     result = json.loads(data)
+
+                     # Print status update
+                     if "status" in result:
+                         status = result["status"]
+                         message = result.get("message", "")
+                         print(f"[{status.upper()}] {message}")
+
+                     # Save image URL if available
+                     if "image_url" in result:
+                         image_url = urljoin(api_url, result["image_url"])
+                         image_urls.append(image_url)
+                         print(f"Image available at: {image_url}")
+
+                         # Open the first image in a browser
+                         if len(image_urls) == 1:
+                             print("Opening first image in browser...")
+                             webbrowser.open(image_url)
+
+                 except json.JSONDecodeError:
+                     print(f"Warning: Received non-JSON data: {data}")
+
+         print("\nProcessing complete.")
+         print(f"Total images processed: {len(image_urls)}")
+
+         return image_urls
+
+     except requests.exceptions.RequestException as e:
+         print(f"Error: Failed to connect to API: {e}")
+         return []
+
+ def process_pdf(api_url, pdf_path, src_lang, tgt_lang, translator):
+     """Process a manga PDF and display results with streaming."""
+     endpoint = urljoin(api_url, "translate/pdf")
+
+     print(f"Translating manga from PDF: {pdf_path}")
+     print(f"Source language: {src_lang}, Target language: {tgt_lang}")
+     print(f"Using translator: {translator}")
+
+     # Prepare files and data for multipart form
+     try:
+         files = {"file": open(pdf_path, "rb")}
+     except FileNotFoundError:
+         print(f"Error: PDF file not found at path: {pdf_path}")
+         return []
+
+     data = {
+         "src_lang": src_lang,
+         "tgt_lang": tgt_lang,
+         "translator": translator
+     }
+
+     # Make the API request with streaming enabled
+     print("\nSending request to API...\n")
+     try:
+         response = requests.post(endpoint, files=files, data=data, stream=True)
+         response.raise_for_status()  # Raise exception for 4XX/5XX status codes
+
+         # Process the streaming response
+         image_urls = []
+         for line in response.iter_lines():
+             if line:
+                 # Strip the SSE "data: " prefix (use a new name so the form dict isn't shadowed)
+                 event_data = line.decode('utf-8').removeprefix('data: ')
+                 try:
+                     # Parse the JSON data
+                     result = json.loads(event_data)
+
+                     # Print status update
+                     if "status" in result:
+                         status = result["status"]
+                         message = result.get("message", "")
+                         print(f"[{status.upper()}] {message}")
+
+                     # Save image URL if available
+                     if "image_url" in result:
+                         image_url = urljoin(api_url, result["image_url"])
+                         image_urls.append(image_url)
+                         print(f"Image available at: {image_url}")
+
+                         # Open the first image in a browser
+                         if len(image_urls) == 1:
+                             print("Opening first image in browser...")
+                             webbrowser.open(image_url)
+
+                 except json.JSONDecodeError:
+                     print(f"Warning: Received non-JSON data: {event_data}")
+
+         print("\nProcessing complete.")
+         print(f"Total images processed: {len(image_urls)}")
+
+         return image_urls
+
+     except requests.exceptions.RequestException as e:
+         print(f"Error: Failed to connect to API: {e}")
+         return []
+     finally:
+         # Close the file
+         files["file"].close()
+
+ def main():
+     # Parse command line arguments
+     parser = argparse.ArgumentParser(description="Manga OCR Translator Client")
+     parser.add_argument("--api-url", default="http://localhost:8000", help="API URL")
+
+     # Create subparsers for URL and PDF commands
+     subparsers = parser.add_subparsers(dest="command", help="Command to run")
+
+     # URL command
+     url_parser = subparsers.add_parser("url", help="Translate manga from URL")
+     url_parser.add_argument("manga_url", help="URL of manga chapter to translate")
+     url_parser.add_argument("--src-lang", default="auto", help="Source language (auto, ja, ko, zh)")
+     url_parser.add_argument("--tgt-lang", default="en", help="Target language (en, es, fr, etc.)")
+     url_parser.add_argument("--translator", default="google",
+                             help="Translation engine (google, mymemory, linguee, pollinations)")
+
+     # PDF command
+     pdf_parser = subparsers.add_parser("pdf", help="Translate manga from PDF")
+     pdf_parser.add_argument("pdf_path", help="Path to PDF file")
+     pdf_parser.add_argument("--src-lang", default="auto", help="Source language (auto, ja, ko, zh)")
+     pdf_parser.add_argument("--tgt-lang", default="en", help="Target language (en, es, fr, etc.)")
+     pdf_parser.add_argument("--translator", default="google",
+                             help="Translation engine (google, mymemory, linguee, pollinations)")
+
+     args = parser.parse_args()
+
+     # Debug print args
+     print("Debug: Command line arguments:", args)
+
+     # Process based on command
+     if args.command == "url":
+         process_url(args.api_url, args.manga_url, args.src_lang, args.tgt_lang, args.translator)
+     elif args.command == "pdf":
+         process_pdf(args.api_url, args.pdf_path, args.src_lang, args.tgt_lang, args.translator)
+     else:
+         print("Error: Please specify a command (url or pdf)")
+         parser.print_help()
+         sys.exit(1)
+
+ if __name__ == "__main__":
+     main()
font/Movistar Text Regular.ttf ADDED
Binary file (57.3 kB)
huggingface-space-metadata.md ADDED
@@ -0,0 +1,28 @@
+ ---
+ title: Manga OCR Translator
+ emoji: 📚
+ colorFrom: indigo
+ colorTo: purple
+ sdk: docker
+ app_port: 7860
+ pinned: false
+ license: mit
+ ---
+
+ # Manga OCR Translator
+
+ Translate manga images from URLs or PDF files using OCR and machine translation. The API returns each translated image as soon as it's processed, without waiting for the entire batch to complete.
+
+ ## Features
+
+ - Manga URL Translation
+ - PDF Translation
+ - Multiple Translation Engines
+ - Streaming Response
+ - Multiple Languages Support
+
+ ## API Endpoints
+
+ - `/translate/url` - Translate manga from a URL
+ - `/translate/pdf` - Translate manga from a PDF file
+ - `/docs` - API documentation
requirements.txt ADDED
@@ -0,0 +1,24 @@
+ # FastAPI and web server
+ fastapi
+ uvicorn
+ python-multipart
+
+ # OCR dependencies
+ easyocr==1.7.0
+ Pillow==10.0.1
+ numpy==1.25.2
+
+ # Web scraping and HTTP
+ requests==2.31.0
+ beautifulsoup4==4.12.2
+
+ # Translation
+ deep-translator==1.11.4
+
+ # PDF processing
+ pdf2image==1.16.3
+ pymupdf==1.23.3  # For PDF processing (alternative to pdf2image)
+
+ # Utilities
+ python-dotenv==1.0.0
+ # Note: uuid is part of the Python standard library; the PyPI "uuid" package is an
+ # obsolete Python 2 backport that shadows it, so it is intentionally not listed here.
static/translated/.gitkeep ADDED
@@ -0,0 +1 @@
+
utils/__init__.py ADDED
@@ -0,0 +1,9 @@
+ # Import main utility functions for easier access
+ from .ocr import detect_text, group_text_regions
+ from .web import scrape_comic_images, download_image
+ from .pdf import pdf_to_images, pdf_stream_to_images
+ from .image import overlay_grouped_text, save_image
+ from .translation import translate_grouped_regions, translate_with_free_translator, translate_with_pollinations
+
+ # Debug print
+ print("Initialized utils package.")
utils/image.py ADDED
@@ -0,0 +1,254 @@
1
+ import os
2
+ import io
3
+ import uuid
4
+ from PIL import Image, ImageDraw, ImageFont
5
+ from typing import List, Dict, Any, Optional, BinaryIO
6
+
7
+ # Font configuration
8
+ FONT_PATH = "font/Movistar Text Regular.ttf" # This should be available in the Docker image
9
+
10
+ # Debug print
11
+ print("Loading image processing module...")
12
+
13
+ def calculate_font_size(text: str, max_width: float, max_height: float, font_path: str = FONT_PATH) -> int:
14
+ """Calculates a suitable font size to fit text within a bounding box."""
15
+ font_size = int(max_height * 0.8) # Start with a reasonable guess
16
+ if font_size <= 0:
17
+ return 1 # Minimum font size
18
+
19
+ try:
20
+ font = ImageFont.truetype(font_path, font_size)
21
+ text_bbox = font.getbbox(text)
22
+ text_width = text_bbox[2] - text_bbox[0]
23
+ text_height = text_bbox[3] - text_bbox[1]
24
+
25
+ # Reduce font size until it fits (simplified approach)
26
+ while (text_width > max_width or text_height > max_height) and font_size > 5:
27
+ font_size -= 1
28
+ font = ImageFont.truetype(font_path, font_size)
29
+ text_bbox = font.getbbox(text)
30
+ text_width = text_bbox[2] - text_bbox[0]
31
+ text_height = text_bbox[3] - text_bbox[1]
32
+
33
+ return max(font_size, 5) # Ensure a minimum size
34
+ except IOError:
35
+ print(f"Warning: Font file not found at {font_path}. Using default PIL font.")
36
+ # Fallback logic if font file is not found
37
+ return int(max_height * 0.5) # Simplified fallback
38
+ except Exception as e:
39
+ print(f"Error calculating font size: {e}")
40
+ return int(max_height * 0.5) # Simplified fallback
41
+
42
+ def overlay_grouped_text(image_content: bytes, translated_grouped_regions: List[dict]) -> Image.Image:
43
+ """Overlay translated grouped text onto the original image and return the PIL Image."""
44
+ try:
45
+ # Debug print
46
+ print("Overlaying translated text on image...")
47
+
48
+ image = Image.open(io.BytesIO(image_content)).convert("RGBA")
49
+ draw = ImageDraw.Draw(image)
50
+
51
+ # Sort regions by area (smallest to largest) to ensure smaller bubbles are processed later
52
+ # This helps with overlapping bubbles, as smaller ones often appear on top
53
+ sorted_regions = sorted(translated_grouped_regions,
54
+ key=lambda r: (r.get("bbox")[2][0] - r.get("bbox")[0][0]) *
55
+ (r.get("bbox")[2][1] - r.get("bbox")[0][1])
56
+ if r.get("bbox") else 0)
57
+
58
+ for group in sorted_regions:
59
+ if "translated_text" not in group or not group.get("is_group", False):
60
+ print("Skipping non-group or untranslated region in overlay.")
61
+ continue
62
+
63
+ group_bbox_corners = group["bbox"] # This is the combined bbox for the group
64
+ translated_text = group["translated_text"]
65
+
66
+ # Extract combined bounding box coordinates [x1, y1, x2, y2]
67
+ x1, y1 = group_bbox_corners[0] # Top-left
68
+ x2, y2 = group_bbox_corners[2] # Bottom-right
69
+
70
+ # Basic validation
71
+ if x1 >= x2 or y1 >= y2:
72
+ print(f"Warning: Degenerate group bbox found: {group_bbox_corners}. Skipping group.")
73
+ continue
74
+
75
+ width, height = x2 - x1, y2 - y1
76
+ if width <= 0 or height <= 0:
77
+ print(f"Warning: Non-positive dimensions for group bbox: {group_bbox_corners}. Skipping group.")
78
+ continue
79
+
80
+ # --- Background Clearing ---
81
+ # Apply a more generous padding to ensure no text from other bubbles bleeds in
82
+ padding = max(10, int(min(width, height) * 0.1)) # Increased padding for better erasure
83
+
84
+ # For more complete text removal, we'll clear both the group bounding box and each original region
85
+
86
+ # 1. First clear the entire group bounding box with padding
87
+ for px in range(int(x1 - padding), int(x2 + padding + 1)):
88
+ for py in range(int(y1 - padding), int(y2 + padding + 1)):
89
+ if 0 <= px < image.width and 0 <= py < image.height:
90
+ image.putpixel((px, py), (255, 255, 255, 255)) # White background
91
+
92
+ # 2. For more thorough clearing, also clear each original region with its own padding
93
+ # This helps ensure we catch text that might be outside the main group bbox
94
+ if "original_regions" in group:
95
+ for orig_region in group["original_regions"]:
96
+ orig_bbox = orig_region["bbox"]
97
+ orig_x1, orig_y1 = orig_bbox[0]
98
+ orig_x2, orig_y2 = orig_bbox[2]
99
+ # Add extra padding specifically for original regions
100
+ region_padding = max(8, int(min(orig_x2 - orig_x1, orig_y2 - orig_y1) * 0.15))
101
+
102
+ # Clear each original region with its own padding
103
+ # Single rectangle fill replaces the per-pixel loop (identical coverage)
+ ImageDraw.Draw(image).rectangle(
+     [max(0, int(orig_x1 - region_padding)), max(0, int(orig_y1 - region_padding)),
+      min(image.width - 1, int(orig_x2 + region_padding)), min(image.height - 1, int(orig_y2 + region_padding))],
+     fill=(255, 255, 255, 255)) # White background
107
+
108
+ print(f"Cleared background for text region and {len(group.get('original_regions', []))} original regions")
109
+
110
+ # --- Font Calculation with Wrapping Logic ---
111
+ # Get an initial font size estimate
112
+ initial_font_size = calculate_font_size(translated_text, width, height, FONT_PATH)
113
+ try:
114
+ font = ImageFont.truetype(FONT_PATH, initial_font_size)
115
+ except Exception as e:
116
+ print(f"Error loading font size {initial_font_size}: {e}. Using default.")
117
+ try:
118
+ font = ImageFont.load_default()
119
+ except Exception as font_e:
120
+ print(f"Error loading default font: {font_e}. Cannot draw text.")
121
+ continue
122
+
123
+ # Calculate effective drawing area (with reduced width for better aesthetics)
124
+ effective_width = width * 0.9 # Reduce slightly to avoid text touching edges
125
+ effective_height = height * 0.9
126
+
127
+ # Determine if text needs wrapping
128
+ text_lines = []
129
+ words = translated_text.split()
130
+ current_line = words[0] if words else ""
131
+
132
+ # Simple word wrapping algorithm
133
+ for word in words[1:]:
134
+ test_line = current_line + " " + word
135
+ # Use getbbox for more accurate width calculation during wrapping check
136
+ line_bbox_wrap = font.getbbox(test_line)
137
+ line_width_wrap = line_bbox_wrap[2] - line_bbox_wrap[0]
138
+
139
+ if line_width_wrap <= effective_width:
140
+ current_line = test_line
141
+ else:
142
+ text_lines.append(current_line)
143
+ current_line = word
144
+
145
+ # Add the last line
146
+ if current_line:
147
+ text_lines.append(current_line)
148
+
149
+ # If no lines were created (empty text), skip
150
+ if not text_lines:
151
+ continue
152
+
153
+ # --- Font Calculation & Line Height (Robust Spacing) ---
154
+ # Use getbbox for line height calculation based on a reference string
155
+ line_bbox_ref = font.getbbox("Tg")
156
+ line_height_metric = line_bbox_ref[3] - line_bbox_ref[1] # Height of the bbox
157
+ # Increase spacing significantly - force separation
158
+ line_spacing_factor = 2.0
159
+ line_height = line_height_metric * line_spacing_factor
160
+ print(f"Using bbox height for metric: {line_height_metric:.2f}, Aggressive Line Height ({line_spacing_factor}x): {line_height:.2f}")
161
+
162
+ # Approximate total height for resizing check
163
+ total_text_height_check = line_height * len(text_lines)
164
+
165
+ # If wrapped text is too tall, recalculate font size
166
+ if total_text_height_check > effective_height:
167
+ print(f"Resizing font: Estimated wrapped height ({total_text_height_check:.1f}) > effective height ({effective_height:.1f})")
168
+ scale_factor = effective_height / total_text_height_check
169
+ new_font_size = max(6, int(initial_font_size * scale_factor)) # Min size 6pt
170
+ print(f"Original font size: {initial_font_size}, New font size: {new_font_size}")
171
+ try:
172
+ font = ImageFont.truetype(FONT_PATH, new_font_size)
173
+ # Recalculate line height metric and line height with new font
174
+ line_bbox_ref = font.getbbox("Tg")
175
+ line_height_metric = line_bbox_ref[3] - line_bbox_ref[1]
176
+ line_height = line_height_metric * line_spacing_factor # Apply same spacing factor
177
+ print(f"Recalculated Aggressive Line Height after resize: {line_height:.2f}")
178
+ except Exception as e:
179
+ print(f"Error loading adjusted font: {e}")
180
+
181
+ # Final font decided. Get its metrics if needed elsewhere, but height is set.
182
+ print(f"Final line height for drawing: {line_height:.2f}")
183
+
184
+ # --- Draw Text (Robust Top-Left Stacking) ---
185
+ try:
186
+ # Calculate vertical starting position for the *top* of the first line
187
+ total_drawn_height = line_height * len(text_lines) # Total height including full spacing for all lines
188
+ start_y_top = y1 + (height - total_drawn_height) / 2
189
+ print(f"Drawing text block: Total Height={total_drawn_height:.1f}, Start Top Y={start_y_top:.1f}")
190
+
191
+ # Draw each line using top-left anchor and explicit vertical step
192
+ for i, line in enumerate(text_lines):
193
+ # Use getlength for precise width if possible
194
+ try:
195
+ line_width = font.getlength(line)
196
+ except AttributeError:
197
+ line_bbox_draw = font.getbbox(line, anchor="lt") # Use top-left anchor for bbox width
198
+ line_width = line_bbox_draw[2] - line_bbox_draw[0]
199
+
200
+ draw_x = x1 + (width - line_width) / 2
201
+ # Position the *top* of the current line
202
+ draw_y_top = start_y_top + (i * line_height)
203
+
204
+ print(f" Drawing line {i+1}/{len(text_lines)}: '{line}' at Top-Left ({draw_x:.1f}, {draw_y_top:.1f}) Width={line_width:.1f}")
205
+
206
+ # Basic bounds check for top-left corner
207
+ draw_x = max(padding, min(image.width - padding - line_width, draw_x))
208
+ draw_y_top = max(padding, min(image.height - padding - line_height_metric, draw_y_top)) # Check against metric height
209
+
210
+ # Draw using Pillow's stroke feature with top-left anchor
211
+ stroke_width = max(1, int(getattr(font, "size", initial_font_size) * 0.08)) # Scale stroke with the final (possibly resized) font, not the initial estimate
212
+ draw.text(
213
+ (draw_x, draw_y_top),
214
+ line,
215
+ font=font,
216
+ fill="black",
217
+ anchor="lt", # Use top-left anchor
218
+ stroke_width=stroke_width,
219
+ stroke_fill="white"
220
+ )
221
+
222
+ print(f"Drew wrapped text ({len(text_lines)} lines) in bbox [{x1:.0f},{y1:.0f} - {x2:.0f},{y2:.0f}]")
223
+
224
+ except Exception as draw_e:
225
+ print(f"Error drawing text: {draw_e}")
226
+
227
+ # Debug statement to confirm processing is complete
228
+ print(f"Overlay complete. Processed {len(sorted_regions)} regions.")
229
+ return image
230
+
231
+ except Exception as e:
232
+ print(f"Error during image overlay: {e}")
233
+ import traceback
234
+ traceback.print_exc()
235
+ # Return original image in case of error
236
+ return Image.open(io.BytesIO(image_content))
237
+
238
+ def save_image(image: Image.Image, output_dir: str = "static/translated") -> str:
239
+ """Save the image to the specified directory and return the path."""
240
+ os.makedirs(output_dir, exist_ok=True)
241
+
242
+ # Generate a unique filename
243
+ filename = f"{uuid.uuid4()}.png"
244
+ filepath = os.path.join(output_dir, filename)
245
+
246
+ # Convert to RGB if the image is in RGBA mode
247
+ if image.mode == "RGBA":
248
+ image = image.convert("RGB")
249
+
250
+ # Save the image
251
+ image.save(filepath)
252
+ print(f"Saved translated image to {filepath}")
253
+
254
+ return filepath
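The wrap-then-shrink logic in the overlay code above (greedily wrap words to the bubble width, then scale the font down when the wrapped block overflows the bubble height) can be sketched independently of Pillow. `char_w` and `line_h` below are illustrative stand-ins for the real `font.getlength()`/`getbbox()` metrics, not part of the code above:

```python
def wrap_words(text, max_width, char_w=10):
    """Greedy word wrap, using a fixed per-character width in place of font.getlength()."""
    words = text.split()
    if not words:
        return []
    lines, current = [], words[0]
    for word in words[1:]:
        test = current + " " + word
        if len(test) * char_w <= max_width:
            current = test  # word still fits on this line
        else:
            lines.append(current)  # commit the line, start a new one
            current = word
    lines.append(current)
    return lines

def shrink_factor(n_lines, line_h, max_height):
    """Scale factor applied to the font size when the wrapped block is too tall."""
    total = n_lines * line_h
    return min(1.0, max_height / total) if total else 1.0
```

In the real function the per-line width comes from `font.getlength` and the shrunken size is clamped to a 6 pt minimum.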
utils/ocr.py ADDED
@@ -0,0 +1,181 @@
1
+ import io
2
+ import os
3
+ import numpy as np
4
+ import math
5
+ from PIL import Image
6
+ import easyocr
7
+ from typing import List, Dict, Any, Optional, Tuple
8
+
9
+ # Debug print
10
+ print("Loading OCR module...")
11
+
12
+ def detect_text(image_content: bytes, language: str) -> List[dict]:
13
+ """Detect text regions in the image using OCR."""
14
+ try:
15
+ # Debug print
16
+ print(f"Processing image for OCR, language: {language}")
17
+
18
+ image = Image.open(io.BytesIO(image_content)).convert("RGB")
19
+ image_np = np.array(image)
20
+
21
+ # Initialize OCR reader
22
+ # Use 'ch_sim' for Simplified Chinese instead of 'zh'
23
+ # EasyOCR allows only one of ko/ja/ch_sim per Reader (each may be combined with 'en' only), so "auto" defaults to Korean + English
+ lang_list = [lang.strip() for lang in language.split(',')] if language != "auto" else ['ko', 'en']
24
+ print(f"Initializing EasyOCR with languages: {lang_list}")
25
+ reader = easyocr.Reader(lang_list, gpu=False) # Specify gpu=False if no GPU or CUDA issues
26
+
27
+ # Detect text
28
+ results = reader.readtext(image_np, detail=1, paragraph=False) # Process line by line
29
+
30
+ # Process results
31
+ text_regions = []
32
+ for bbox, text, conf in results:
33
+ # bbox is [[x1,y1],[x2,y1],[x2,y2],[x1,y2]]
34
+ # Ensure bbox coordinates are standard Python numbers
35
+ bbox_float = [[float(p[0]), float(p[1])] for p in bbox]
36
+ if conf > 0.3: # Confidence threshold (adjust as needed)
37
+ text_regions.append({
38
+ "bbox": bbox_float,
39
+ "text": text,
40
+ "confidence": float(conf)
41
+ })
42
+ print(f"Detected {len(text_regions)} text regions.")
43
+ return text_regions
44
+ except Exception as e:
45
+ print(f"Error during OCR detection: {e}")
46
+ return []
47
+
48
+ # Rectangle utility functions for speech bubble detection
49
+ def rect_distance(rect1, rect2):
50
+ """Calculate the distance between two rectangles (bounding boxes)"""
51
+ # Convert from [[x1,y1],[x2,y1],[x2,y2],[x1,y2]] format to [x1,y1,x2,y2]
52
+ r1 = [rect1[0][0], rect1[0][1], rect1[2][0], rect1[2][1]]
53
+ r2 = [rect2[0][0], rect2[0][1], rect2[2][0], rect2[2][1]]
54
+
55
+ # Check for overlap
56
+ if (r1[0] <= r2[2] and r2[0] <= r1[2] and r1[1] <= r2[3] and r2[1] <= r1[3]):
57
+ return 0 # Rectangles overlap
58
+
59
+ # Calculate distances
60
+ dx = max(0, max(r1[0], r2[0]) - min(r1[2], r2[2]))
61
+ dy = max(0, max(r1[1], r2[1]) - min(r1[3], r2[3]))
62
+
63
+ # Return Euclidean distance
64
+ return math.sqrt(dx*dx + dy*dy)
65
+
66
+ def rect_center(rect):
67
+ """Calculate the center point of a rectangle"""
68
+ # Convert from [[x1,y1],[x2,y1],[x2,y2],[x1,y2]] format
69
+ x1, y1 = rect[0]
70
+ x2, y2 = rect[2]
71
+ return [(x1 + x2) / 2, (y1 + y2) / 2]
72
+
73
+ def rect_contains_point(rect, point):
74
+ """Check if a rectangle contains a point"""
75
+ # Convert from [[x1,y1],[x2,y1],[x2,y2],[x1,y2]] format
76
+ x1, y1 = rect[0]
77
+ x2, y2 = rect[2]
78
+ px, py = point
79
+ return x1 <= px <= x2 and y1 <= py <= y2
80
+
81
+ def expand_rect(rect1, rect2):
82
+ """Create a new rectangle that encompasses both input rectangles"""
83
+ # Convert from [[x1,y1],[x2,y1],[x2,y2],[x1,y2]] format
84
+ x1_1, y1_1 = rect1[0]
85
+ x2_1, y2_1 = rect1[2]
86
+ x1_2, y1_2 = rect2[0]
87
+ x2_2, y2_2 = rect2[2]
88
+
89
+ # Find the min/max coordinates
90
+ x1 = min(x1_1, x1_2)
91
+ y1 = min(y1_1, y1_2)
92
+ x2 = max(x2_1, x2_2)
93
+ y2 = max(y2_1, y2_2)
94
+
95
+ # Return in the format [[x1,y1],[x2,y1],[x2,y2],[x1,y2]]
96
+ return [[x1, y1], [x2, y1], [x2, y2], [x1, y2]]
97
+
98
+ def is_valid_rect(rect):
99
+ """Validate rectangle properties"""
100
+ # Convert from [[x1,y1],[x2,y1],[x2,y2],[x1,y2]] format
101
+ x1, y1 = rect[0]
102
+ x2, y2 = rect[2]
103
+
104
+ # Check if width and height are positive
105
+ return x2 > x1 and y2 > y1
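The rectangle helpers above operate on EasyOCR-style corner lists `[[x1,y1],[x2,y1],[x2,y2],[x1,y2]]`. A minimal self-contained copy of the gap-distance and union math (same formulas as above, no project imports) shows the concrete behavior — `box` is just a test helper:

```python
import math

def rect_distance(rect1, rect2):
    # Flatten corner lists [[x1,y1],[x2,y1],[x2,y2],[x1,y2]] to [x1,y1,x2,y2]
    r1 = [rect1[0][0], rect1[0][1], rect1[2][0], rect1[2][1]]
    r2 = [rect2[0][0], rect2[0][1], rect2[2][0], rect2[2][1]]
    if r1[0] <= r2[2] and r2[0] <= r1[2] and r1[1] <= r2[3] and r2[1] <= r1[3]:
        return 0  # rectangles overlap
    dx = max(0, max(r1[0], r2[0]) - min(r1[2], r2[2]))  # horizontal gap
    dy = max(0, max(r1[1], r2[1]) - min(r1[3], r2[3]))  # vertical gap
    return math.sqrt(dx * dx + dy * dy)

def expand_rect(a, b):
    # Axis-aligned union of two corner-list rectangles
    x1 = min(a[0][0], b[0][0]); y1 = min(a[0][1], b[0][1])
    x2 = max(a[2][0], b[2][0]); y2 = max(a[2][1], b[2][1])
    return [[x1, y1], [x2, y1], [x2, y2], [x1, y2]]

box = lambda x1, y1, x2, y2: [[x1, y1], [x2, y1], [x2, y2], [x1, y2]]
```

Two boxes separated by a 3 px horizontal and 4 px vertical gap are 5 px apart; their union spans both.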
106
+
107
+ def group_text_regions(regions: List[Dict], proximity_threshold: float = 100.0) -> List[Dict]:
108
+ """Groups text regions based on proximity and overlap to identify speech bubbles."""
109
+ if not regions:
110
+ return []
111
+
112
+ # Extract bounding boxes from regions
113
+ bboxes = [region['bbox'] for region in regions]
114
+
115
+ # Dictionary to track which rectangles have been grouped
116
+ grouped = [False] * len(bboxes)
117
+
118
+ # List to store the grouped rectangles
119
+ grouped_boxes = []
120
+
121
+ # Group rectangles based on proximity and overlap
122
+ for i in range(len(bboxes)):
123
+ if grouped[i]:
124
+ continue # Skip if already grouped
125
+
126
+ # Start a new group with this rectangle
127
+ current_group = bboxes[i]
128
+ grouped[i] = True
129
+
130
+ # Flag to check if we made any changes in this pass
131
+ made_changes = True
132
+
133
+ # Keep expanding the group until no more changes
134
+ while made_changes:
135
+ made_changes = False
136
+
137
+ for j in range(len(bboxes)):
138
+ if grouped[j]:
139
+ continue # Skip if already grouped
140
+
141
+ # Check if this rectangle should be added to the current group
142
+ if rect_distance(current_group, bboxes[j]) < proximity_threshold:
143
+ # Expand the current group to include this rectangle
144
+ current_group = expand_rect(current_group, bboxes[j])
145
+ grouped[j] = True
146
+ made_changes = True
147
+
148
+ # Add the final group to our list if it's valid
149
+ if is_valid_rect(current_group):
150
+ grouped_boxes.append(current_group)
151
+
152
+ # Now combine text from all regions within each group
153
+ result_groups = []
154
+
155
+ for group_bbox in grouped_boxes:
156
+ # Find all regions whose center is within this group
157
+ group_regions = []
158
+ group_text = ""
159
+
160
+ for region in regions:
161
+ center = rect_center(region['bbox'])
162
+ if rect_contains_point(group_bbox, center):
163
+ group_regions.append(region)
164
+ # Add space between text fragments
165
+ if group_text:
166
+ group_text += " "
167
+ group_text += region['text']
168
+
169
+ # Create the grouped region
170
+ if group_regions:
171
+ result_groups.append({
172
+ "text": group_text,
173
+ "bbox": group_bbox,
174
+ "original_regions": group_regions,
175
+ "is_group": True
176
+ })
177
+
178
+ # Debug output
179
+ print(f"Proximity-based grouping: {len(regions)} individual regions into {len(result_groups)} speech bubbles.")
180
+
181
+ return result_groups
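The text-merging half of `group_text_regions` — concatenate, in input order, every region whose center falls inside the group's bounding box — can be exercised standalone with synthetic regions; `merge_group_text` and `box` are illustrative helpers, not names from the module above:

```python
def rect_center(rect):
    (x1, y1), (x2, y2) = rect[0], rect[2]
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def rect_contains_point(rect, point):
    (x1, y1), (x2, y2) = rect[0], rect[2]
    px, py = point
    return x1 <= px <= x2 and y1 <= py <= y2

def merge_group_text(regions, group_bbox):
    """Join the text of every region whose center lies inside group_bbox, in input order."""
    parts = [r["text"] for r in regions
             if rect_contains_point(group_bbox, rect_center(r["bbox"]))]
    return " ".join(parts)

box = lambda x1, y1, x2, y2: [[x1, y1], [x2, y1], [x2, y2], [x1, y2]]

regions = [
    {"bbox": box(0, 0, 10, 10), "text": "HELLO"},
    {"bbox": box(0, 12, 10, 20), "text": "WORLD"},
    {"bbox": box(200, 200, 210, 210), "text": "FAR"},  # outside the group, excluded
]
merged = merge_group_text(regions, box(0, 0, 12, 22))
```

Two stacked OCR lines in the same bubble collapse into one string; the distant region is left out.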
utils/pdf.py ADDED
@@ -0,0 +1,108 @@
1
+ import os
2
+ import io
3
+ import uuid
4
+ import tempfile
5
+ from typing import List, Optional, Dict, Any, BinaryIO
6
+ import fitz # PyMuPDF
7
+ from PIL import Image
8
+ import numpy as np
9
+
10
+ # Debug print
11
+ print("Loading PDF processing module...")
12
+
13
+ async def pdf_to_images(pdf_file: BinaryIO, output_dir: Optional[str] = None) -> List[str]:
14
+ """
15
+ Convert PDF file to a list of image paths.
16
+
17
+ Args:
18
+ pdf_file: File-like object containing PDF data
19
+ output_dir: Directory to save the images (optional)
20
+
21
+ Returns:
22
+ List of image file paths
23
+ """
24
+ # If no output directory is provided, use a temporary directory
25
+ if output_dir is None:
26
+ temp_dir = tempfile.mkdtemp()
27
+ output_dir = temp_dir
28
+ print(f"Using temporary directory for PDF images: {temp_dir}")
29
+ else:
30
+ os.makedirs(output_dir, exist_ok=True)
31
+ print(f"Using provided directory for PDF images: {output_dir}")
32
+
33
+ try:
34
+ # Debug print
35
+ print("Processing PDF file...")
36
+
37
+ # Read PDF file content
38
+ pdf_data = pdf_file.read()
39
+
40
+ # Create a unique subfolder for this PDF to avoid name collisions
41
+ pdf_id = uuid.uuid4().hex[:8]
42
+ pdf_output_dir = os.path.join(output_dir, f"pdf_{pdf_id}")
43
+ os.makedirs(pdf_output_dir, exist_ok=True)
44
+
45
+ # Open PDF document with PyMuPDF
46
+ pdf_document = fitz.open(stream=pdf_data, filetype="pdf")
47
+ image_paths = []
48
+
49
+ for page_num in range(len(pdf_document)):
50
+ # Get the page
51
+ page = pdf_document.load_page(page_num)
52
+
53
+ # Render page to an image (adjust the matrix for higher resolution if needed)
54
+ # Default DPI is 72, so fitz.Matrix(2, 2) renders at 144 DPI (use Matrix(4, 4) for 288 DPI)
55
+ pix = page.get_pixmap(matrix=fitz.Matrix(2, 2))
56
+
57
+ # Save the image
58
+ image_path = os.path.join(pdf_output_dir, f"page_{page_num+1}.png")
59
+ pix.save(image_path)
60
+ image_paths.append(image_path)
61
+ print(f"Saved PDF page {page_num+1} to {image_path}")
62
+
63
+ pdf_document.close()
64
+ return image_paths
65
+
66
+ except Exception as e:
67
+ print(f"Error converting PDF to images: {e}")
68
+ return []
69
+
70
+ async def pdf_stream_to_images(pdf_stream: bytes) -> List[bytes]:
71
+ """
72
+ Convert PDF binary data to a list of image binary data.
73
+ Useful for processing PDFs in memory without saving to disk.
74
+
75
+ Args:
76
+ pdf_stream: PDF file binary data
77
+
78
+ Returns:
79
+ List of image binary data (bytes)
80
+ """
81
+ try:
82
+ # Debug print
83
+ print("Processing PDF stream in memory...")
84
+
85
+ # Open PDF document from binary data
86
+ pdf_document = fitz.open(stream=pdf_stream, filetype="pdf")
87
+ images_data = []
88
+
89
+ for page_num in range(len(pdf_document)):
90
+ # Get the page
91
+ page = pdf_document.load_page(page_num)
92
+
93
+ # Render page to an image with 2x resolution (adjust as needed)
94
+ pix = page.get_pixmap(matrix=fitz.Matrix(2, 2))
95
+
96
+ # Convert to PIL Image and then to bytes
97
+ img = Image.frombytes("RGB", [pix.width, pix.height], pix.samples)
98
+ img_bytes = io.BytesIO()
99
+ img.save(img_bytes, format="PNG")
100
+ images_data.append(img_bytes.getvalue())
101
+ print(f"Processed PDF page {page_num+1} in memory")
102
+
103
+ pdf_document.close()
104
+ return images_data
105
+
106
+ except Exception as e:
107
+ print(f"Error converting PDF stream to images: {e}")
108
+ return []
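PyMuPDF renders pages at 72 DPI by default, and `fitz.Matrix(z, z)` scales both axes by `z`, so the effective resolution and pixel dimensions follow from plain arithmetic (no PyMuPDF needed; the A4 page size below is just an example, and the helper names are illustrative):

```python
BASE_DPI = 72  # PyMuPDF's default rendering resolution

def effective_dpi(zoom):
    """Resolution produced by rendering with fitz.Matrix(zoom, zoom)."""
    return BASE_DPI * zoom

def rendered_size(page_w_pt, page_h_pt, zoom):
    """Pixel size of a rendered page; page dimensions are in points (1 pt = 1/72 in)."""
    return round(page_w_pt * zoom), round(page_h_pt * zoom)
```

With `zoom=2` as in the code above, an A4 page (595 x 842 pt) comes out at 1190 x 1684 px.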
utils/translation.py ADDED
@@ -0,0 +1,270 @@
1
+ import json
2
+ import requests
3
+ from typing import List, Dict, Any, Optional
4
+ from deep_translator import GoogleTranslator, MyMemoryTranslator, LingueeTranslator
5
+
6
+ # Debug print
7
+ print("Loading translation module...")
8
+
9
+ # Default free translator
10
+ DEFAULT_FREE_TRANSLATOR = "google"
11
+
12
+ def translate_with_free_translator(texts: List[str], src_lang: str, target_lang: str,
13
+ translator_type: str = DEFAULT_FREE_TRANSLATOR) -> List[Dict[str, str]]:
14
+ """Translate texts using available free translation APIs."""
15
+ if not texts:
16
+ return []
17
+
18
+ # Debug info
19
+ print(f"Translating {len(texts)} texts using {translator_type} translator")
20
+ print(f"Source language: {src_lang}, Target language: {target_lang}")
21
+
22
+ # Standardize language codes for different services
23
+ lang_map = {
24
+ # ISO-639 language code mapping for various services
25
+ "auto": "auto",
26
+ "en": "en",
27
+ "zh": "zh-CN",
28
+ "ja": "ja",
29
+ "ko": "ko",
30
+ "es": "es",
31
+ "fr": "fr",
32
+ "de": "de",
33
+ "it": "it",
34
+ "pt": "pt",
35
+ "ru": "ru"
36
+ }
37
+
38
+ # Map to standardized language codes if available, otherwise use as-is
39
+ std_src_lang = lang_map.get(src_lang, src_lang)
40
+ std_target_lang = lang_map.get(target_lang, target_lang)
41
+
42
+ translated_results = []
43
+
44
+ try:
45
+ # Select translator based on specified type
46
+ if translator_type == "google":
47
+ # Google Translate (free tier without API key)
48
+ translator = GoogleTranslator(source=std_src_lang if std_src_lang != "auto" else "auto",
49
+ target=std_target_lang)
50
+
51
+ for text in texts:
52
+ if not text or len(text.strip()) < 2:
53
+ translated_results.append({"original": text, "translated": text})
54
+ continue
55
+
56
+ try:
57
+ translated = translator.translate(text)
58
+ translated_results.append({
59
+ "original": text,
60
+ "translated": translated or text # Fallback to original if None
61
+ })
62
+ print(f"Translated: '{text}' -> '{translated}'")
63
+ except Exception as e:
64
+ print(f"Error translating text '{text}': {e}")
65
+ translated_results.append({"original": text, "translated": text})
66
+
67
+ elif translator_type == "mymemory":
68
+ # MyMemory (free with limits)
69
+ translator = MyMemoryTranslator(source=std_src_lang if std_src_lang != "auto" else "auto",
70
+ target=std_target_lang)
71
+
72
+ for text in texts:
73
+ if not text or len(text.strip()) < 2:
74
+ translated_results.append({"original": text, "translated": text})
75
+ continue
76
+
77
+ try:
78
+ translated = translator.translate(text)
79
+ translated_results.append({
80
+ "original": text,
81
+ "translated": translated or text
82
+ })
83
+ print(f"Translated: '{text}' -> '{translated}'")
84
+ except Exception as e:
85
+ print(f"Error translating text '{text}': {e}")
86
+ translated_results.append({"original": text, "translated": text})
87
+
88
+ elif translator_type == "linguee":
89
+ # Linguee (free)
90
+ # Note: Linguee has limited language support
91
+ try:
92
+ translator = LingueeTranslator(source=std_src_lang, target=std_target_lang)
93
+
94
+ for text in texts:
95
+ if not text or len(text.strip()) < 2:
96
+ translated_results.append({"original": text, "translated": text})
97
+ continue
98
+
99
+ try:
100
+ translated = translator.translate(text)
101
+ translated_results.append({
102
+ "original": text,
103
+ "translated": translated or text
104
+ })
105
+ print(f"Translated: '{text}' -> '{translated}'")
106
+ except Exception as e:
107
+ print(f"Error translating text '{text}': {e}")
108
+ translated_results.append({"original": text, "translated": text})
109
+ except Exception as e:
110
+ print(f"Linguee translator error: {e}. Falling back to Google Translate.")
111
+ # Fallback to Google Translate
112
+ return translate_with_free_translator(texts, src_lang, target_lang, "google")
113
+
114
+ else:
115
+ # Default fallback to Google
116
+ print(f"Unknown translator type '{translator_type}', using Google Translate as fallback")
117
+ return translate_with_free_translator(texts, src_lang, target_lang, "google")
118
+
119
+ except Exception as e:
120
+ print(f"Error setting up translator: {e}")
121
+ # Return original texts if translation fails
122
+ for text in texts:
123
+ translated_results.append({"original": text, "translated": text})
124
+
125
+ return translated_results
126
+
127
+ def translate_with_pollinations(texts: List[str], src_lang: str, target_lang: str) -> List[Dict[str, str]]:
128
+ """Translate texts using Pollinations.ai API."""
129
+ if not texts:
130
+ return []
131
+
132
+ try:
133
+ # Convert language codes to what Pollinations expects
134
+ lang_map = {
135
+ "zh": "zh-CN",
136
+ "ko": "ko",
137
+ "ja": "ja",
138
+ "en": "en",
139
+ "auto": "auto"
140
+ }
141
+
142
+ # Map our language codes to Pollinations expected codes
143
+ src_lang_mapped = lang_map.get(src_lang, src_lang)
144
+ target_lang_mapped = lang_map.get(target_lang, target_lang)
145
+
146
+ # Pad the batch to at least 10 texts by repeating inputs; duplicates are harmless because results are later folded into a dict keyed by original text
147
+ batch_texts = texts.copy()
148
+ while len(batch_texts) < 10:
149
+ batch_texts.extend(texts[:min(len(texts), 10-len(batch_texts))])
150
+
151
+ # Prepare the system prompt for the translation task
152
+ system_prompt = f"You are a professional translator. Translate the following texts from {src_lang_mapped} to {target_lang_mapped}. Preserve the meaning, tone, and style of the original text. Return the results in JSON format with 'original' and 'translated' keys for each text."
153
+
154
+ # Create the user prompt with the texts to translate
155
+ user_prompt = "Translate these texts and return a JSON array with objects containing 'original' and 'translated' properties:\n"
156
+ for i, text in enumerate(batch_texts):
157
+ user_prompt += f"{i+1}. {text}\n"
158
+
159
+ # Prepare the API request to Pollinations.ai
160
+ api_url = "https://api.pollinations.ai/v2/generate/text"
161
+ headers = {
162
+ "Content-Type": "application/json"
163
+ }
164
+
165
+ payload = {
166
+ "model": "openai", # Using OpenAI model as it's good for translation
167
+ "prompt": user_prompt,
168
+ "system": system_prompt,
169
+ "jsonMode": True, # Request JSON output
170
+ "reasoning_effort": "high", # Higher quality translations
171
+ "private": True,
172
+ "referrer": "manga_ocr_translator"
173
+ }
174
+
175
+ print(f"Sending batch of {len(batch_texts)} texts to Pollinations.ai for translation")
176
+ response = requests.post(api_url, headers=headers, json=payload, timeout=60)
177
+ response.raise_for_status()
178
+
179
+ # Parse the response
180
+ result = response.json()
181
+ translated_text = result.get("response", "")
182
+
183
+ # The response should be a JSON string that we need to parse
184
+ try:
185
+ translated_data = json.loads(translated_text)
186
+
187
+ # Map the translation results back to the original texts
188
+ # Create a mapping of original text to its translation
189
+ translation_map = {}
190
+ for item in translated_data:
191
+ if isinstance(item, dict) and "original" in item and "translated" in item:
192
+ translation_map[item["original"]] = item["translated"]
193
+
194
+ # Apply translations to our original texts list
195
+ translated_results = []
196
+ for text in texts:
197
+ translated_results.append({
198
+ "original": text,
199
+ "translated": translation_map.get(text, text) # Default to original if not found
200
+ })
201
+ print(f"Pollinations translation: '{text}' -> '{translation_map.get(text, text)}'")
202
+
203
+ return translated_results
204
+
205
+ except json.JSONDecodeError as e:
206
+ print(f"Error parsing translation response as JSON: {e}")
207
+ print(f"Raw response: {translated_text}")
208
+ # Fallback: Return original texts
209
+ return [{"original": text, "translated": text} for text in texts]
210
+
211
+ except Exception as e:
212
+ print(f"Error with Pollinations.ai translation: {e}")
213
+ # Return original texts as fallback
214
+ return [{"original": text, "translated": text} for text in texts]
215
+
216
+ def translate_grouped_regions(grouped_regions: List[Dict], src_lang: str, target_lang: str, use_pollinations: bool = False,
217
+ free_translator: str = DEFAULT_FREE_TRANSLATOR) -> List[Dict]:
218
+ """Translate text within grouped regions."""
219
+ if not grouped_regions:
220
+ return []
221
+
222
+ # Add translated_text to all regions with original text as a fallback
223
+ for region in grouped_regions:
224
+ region["translated_text"] = region["text"] # Default fallback for the group
225
+
226
+ # Extract all texts (already grouped) for translation
227
+ texts_to_translate = [region["text"] for region in grouped_regions if region["text"] and len(region["text"].strip()) >= 2]
228
+
229
+ if not texts_to_translate:
230
+ print("No valid grouped texts to translate")
231
+ return grouped_regions # Return groups with original text as fallback
232
+
233
+ try:
234
+ print(f"Translating {len(texts_to_translate)} grouped texts from '{src_lang}' to '{target_lang}'...")
235
+
236
+ translation_results = []
237
+ # Use Pollinations.ai for translation if enabled
238
+ if use_pollinations:
239
+ print("Using Pollinations.ai for translation")
240
+ translation_results = translate_with_pollinations(texts_to_translate, src_lang, target_lang)
241
+
242
+ # Otherwise, use selected free translator
243
+ else:
244
+ print(f"Using free translator: {free_translator}")
245
+ translation_results = translate_with_free_translator(
246
+ texts_to_translate,
247
+ src_lang,
248
+ target_lang,
249
+ free_translator
250
+ )
251
+
252
+ # Create a dictionary mapping original grouped text to translated text
253
+ # Ensure the results match the input order
254
+ translations_dict = {item["original"]: item["translated"] for item in translation_results}
255
+
256
+ # Apply translations back to the grouped regions
257
+ for region in grouped_regions:
258
+ original_text = region["text"]
259
+ if original_text in translations_dict:
260
+ region["translated_text"] = translations_dict[original_text]
261
+ print(f" Applied translation to group: '{original_text}' -> '{region['translated_text']}'")
262
+ else:
263
+ print(f" Warning: Translation not found for group text: '{original_text}'") # Should not happen if results map correctly
264
+
265
+ return grouped_regions
266
+
267
+ except Exception as e:
268
+ print(f"Error during grouped translation setup: {e}")
269
+ # Fallback already handled by setting original text
270
+ return grouped_regions
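The map-back step in `translate_grouped_regions` — key translations by original text and fall back to the original on a miss — can be shown in isolation; `apply_translations` is an illustrative stand-in for the inline loop above:

```python
def apply_translations(regions, results):
    """Attach translated_text to each region; results is a list of {'original','translated'} dicts."""
    lookup = {item["original"]: item["translated"] for item in results}
    for region in regions:
        # Fall back to the untranslated text when the translator returned nothing for it
        region["translated_text"] = lookup.get(region["text"], region["text"])
    return regions

regions = [{"text": "안녕"}, {"text": "???"}]
results = [{"original": "안녕", "translated": "hello"}]
out = apply_translations(regions, results)
```

A region whose text never reached the translator (e.g. filtered as too short) keeps its original text, which matches the fallback set at the top of the function.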
utils/web.py ADDED
@@ -0,0 +1,78 @@
1
+ import requests
2
+ import os
3
+ from typing import List, Dict, Any, Optional
4
+ from bs4 import BeautifulSoup
5
+
6
+ # Debug print
7
+ print("Loading web scraping module...")
8
+
9
+ def scrape_comic_images(url: str) -> List[str]:
10
+ """Scrape all comic images from the provided URL."""
11
+ headers = {
12
+ "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"
13
+ }
14
+ try:
15
+ # Debug print
16
+ print(f"Scraping manga images from URL: {url}")
17
+
18
+ response = requests.get(url, headers=headers, timeout=15)
19
+ response.raise_for_status() # Raise an exception for bad status codes
20
+
21
+ soup = BeautifulSoup(response.content, "html.parser")
22
+ images = []
23
+ image_urls = set() # Use a set to avoid duplicate URLs
24
+
25
+ # Common selectors for manhwa/manhua sites
26
+ selectors = [
27
+ ".chapter-content img",
28
+ ".comic-container img",
29
+ ".reading-content img",
30
+ "#readerarea img",
31
+ ".viewer-container img",
32
+ "img.comic-panel"
33
+ ]
34
+
35
+ for selector in selectors:
36
+ for img in soup.select(selector):
37
+ src = img.get("src") or img.get("data-src") or img.get("data-original")
38
+ if src:
39
+ # Resolve relative URLs
40
+ src = requests.compat.urljoin(url, src.strip())
41
+ if src not in image_urls:
42
+ images.append(src)
43
+ image_urls.add(src)
44
+
45
+ if not images:
46
+ # Fallback: Find all images if specific selectors fail
47
+ print("Warning: Specific selectors failed, trying to find all images.")
48
+ for img in soup.find_all("img"):
49
+ src = img.get("src") or img.get("data-src") or img.get("data-original")
50
+ if src:
51
+ src = requests.compat.urljoin(url, src.strip())
52
+ if src not in image_urls:
53
+ images.append(src)
54
+ image_urls.add(src)
55
+
56
+ print(f"Found {len(images)} manga images.")
57
+ if not images:
58
+ raise ValueError("Could not find any images on the page using common selectors.")
59
+
60
+ return images
61
+
62
+ except Exception as e:
63
+ print(f"Error scraping comic images: {e}")
64
+ return []
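`requests.compat.urljoin` is the standard library's `urllib.parse.urljoin`, so the resolve-and-deduplicate step used in `scrape_comic_images` can be demonstrated without any network access; `collect_image_urls` is an illustrative helper, not a name from the module above:

```python
from urllib.parse import urljoin

def collect_image_urls(page_url, srcs):
    """Resolve relative src values against the page URL, keeping first-seen order, dropping duplicates."""
    images, seen = [], set()
    for src in srcs:
        if not src:  # skip missing src/data-src attributes
            continue
        absolute = urljoin(page_url, src.strip())
        if absolute not in seen:
            images.append(absolute)
            seen.add(absolute)
    return images

urls = collect_image_urls(
    "https://example.com/ch/1",
    ["../img/p1.png", "https://cdn.example.com/p2.png", "../img/p1.png", None],
)
```

Relative paths are resolved against the chapter URL, absolute URLs pass through unchanged, and the repeated entry is dropped.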
65
+
66
+ async def download_image(image_url: str) -> Optional[bytes]:
67
+ """Download an image from the provided URL."""
68
+ headers = {
69
+ "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"
70
+ }
71
+ try:
72
+ print(f"Downloading image: {image_url}")
73
+ response = requests.get(image_url, headers=headers, timeout=15)
74
+ response.raise_for_status()
75
+ return response.content
76
+ except Exception as e:
77
+ print(f"Error downloading image {image_url}: {e}")
78
+ return None