avans06 committed on
Commit 79914f7
1 Parent(s): 5bd05cf

init commit.
.gitignore ADDED
@@ -0,0 +1,156 @@
+ .vs
+ .vscode
+ # Byte-compiled / optimized / DLL files
+ __pycache__/
+ *.py[cod]
+ *$py.class
+
+ # C extensions
+ *.so
+
+ # Distribution / packaging
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ share/python-wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ MANIFEST
+
+ # PyInstaller
+ # Usually these files are written by a python script from a template
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
+ *.manifest
+ *.spec
+
+ # Installer logs
+ pip-log.txt
+ pip-delete-this-directory.txt
+
+ # Unit test / coverage reports
+ htmlcov/
+ .tox/
+ .nox/
+ .coverage
+ .coverage.*
+ .cache
+ nosetests.xml
+ coverage.xml
+ *.cover
+ *.py,cover
+ .hypothesis/
+ .pytest_cache/
+ cover/
+
+ # Translations
+ *.mo
+ *.pot
+
+ # Django stuff:
+ *.log
+ local_settings.py
+ db.sqlite3
+ db.sqlite3-journal
+
+ # Flask stuff:
+ instance/
+ .webassets-cache
+
+ # Scrapy stuff:
+ .scrapy
+
+ # Sphinx documentation
+ docs/_build/
+
+ # PyBuilder
+ .pybuilder/
+ target/
+
+ # Jupyter Notebook
+ .ipynb_checkpoints
+
+ # IPython
+ profile_default/
+ ipython_config.py
+
+ # pyenv
+ # For a library or package, you might want to ignore these files since the code is
+ # intended to run in multiple environments; otherwise, check them in:
+ # .python-version
+
+ # pipenv
+ # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+ # However, in case of collaboration, if having platform-specific dependencies or dependencies
+ # having no cross-platform support, pipenv may install dependencies that don't work, or not
+ # install all needed dependencies.
+ #Pipfile.lock
+
+ # poetry
+ # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
+ # This is especially recommended for binary packages to ensure reproducibility, and is more
+ # commonly ignored for libraries.
+ # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
+ #poetry.lock
+
+ # PEP 582; used by e.g. github.com/David-OConnor/pyflow
+ __pypackages__/
+
+ # Celery stuff
+ celerybeat-schedule
+ celerybeat.pid
+
+ # SageMath parsed files
+ *.sage.py
+
+ # Environments
+ .env
+ .venv
+ env/
+ venv/
+ ENV/
+ env.bak/
+ venv.bak/
+
+ # Spyder project settings
+ .spyderproject
+ .spyproject
+
+ # Rope project settings
+ .ropeproject
+
+ # mkdocs documentation
+ /site
+
+ # mypy
+ .mypy_cache/
+ .dmypy.json
+ dmypy.json
+
+ # Pyre type checker
+ .pyre/
+
+ # pytype static type analyzer
+ .pytype/
+
+ # Cython debug symbols
+ cython_debug/
+
+ # PyCharm
+ # JetBrains specific template is maintained in a separate JetBrains.gitignore that can
+ # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
+ # and can be added to the global gitignore or merged into this file. For a more nuclear
+ # option (not recommended) you can uncomment the following to ignore the entire idea folder.
+ #.idea/
+
+ tmp/
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2023 Eren
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md CHANGED
@@ -1,8 +1,8 @@
  ---
  title: AdenzuMangaPanelExtractor
- emoji: 📉
- colorFrom: red
- colorTo: pink
  sdk: gradio
  sdk_version: 5.35.0
  app_file: app.py
@@ -11,3 +11,99 @@ short_description: Adenzu Manga/Comics Panel Extractor (WebUI)
  ---

  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
  title: AdenzuMangaPanelExtractor
+ emoji: 🕮
+ colorFrom: green
+ colorTo: indigo
  sdk: gradio
  sdk_version: 5.35.0
  app_file: app.py

  ---

  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
+ ___
+
+ # Adenzu Manga/Comics Panel Extractor (WebUI)
+
+ Upload your manga or comic book images. This tool will automatically analyze the panels on each page,
+ crop them into individual image files, and package them into a single ZIP file for you to download.
+
+ The core panel detection was created by **adenzu** ([Original Project](https://github.com/adenzu/Manga-Panel-Extractor)).
+
+ ___
+
+ # Manga Panel Extractor
+
+ A simple program that takes manga pages and outputs the panels on them. The current approach of this program was inspired by this [paper](related-paper.pdf). Thanks to Xufang Pang, Ying Cao, Rynson W.H. Lau, and Antoni B. Chan for their work.
+
+ Please read the [report](reports/internal/report.pdf) for a detailed explanation of the implemented algorithm(s).
+
+ - Please note that this program is designed specifically for manga and may not work correctly with manhwa or other similar formats.
+
+ ## Installation
+
+ Visit the [Releases](https://github.com/adenzu/Manga-Panel-Extractor/releases) section of the repository and download the executable for Windows. Currently, only Windows 7+ is supported. However, you can build for other platforms from the provided source code.
+
+ ## Usage
+
+ ### Website
+ 1. Go to the website https://adenzu.github.io/Manga-Panel-Extractor/
+ 2. Select the image files of the manga pages you want to extract panels from
+ 3. Click `Start`
+ 4. Click `Cancel` to cancel the panel extraction process at any time
+ 5. Click `Download` to download a zip file containing all the panels
+
+ - **Note:** Processing too many images at once may cause your computer to lag.
+
+ ### Executable
+
+ 0. You can check the examples in advance to see whether this program can help with your images; see examples [here](tests/data/test_performance/README.md#what-it-does).
+ 1. [Download the latest executable](https://github.com/adenzu/Manga-Panel-Extractor/releases/latest) for your operating system.
+ 2. Run the downloaded executable.
+ 3. Select the input directory containing the manga page images. Each image should represent one manga page.
+ 4. Choose the output directory where the extracted panels will be saved.
+ 5. Check the "Split Joint Panels" checkbox to split joint panels. **This slows down the process by up to ten times.**
+ 6. Check the "Fallback" checkbox to apply a fallback method in case an extraction fails.
+ 7. Check the "Output to Separate Folders" checkbox to extract each image's panels into its own folder.
+ 8. Click "Start" to initiate the panel extraction process. You can monitor the progress in the bottom left corner of the program window.
+ 9. To cancel the process, click "Cancel".
+
+ - Please note that this program is designed specifically for manga and may not work correctly with manhwa or other similar formats.
+
+ ### CLI - Input and Output Directories
+
+ ```bash
+ python main.py [input_dir] [output_dir] [-s] [-f] [-g]
+ ```
+
+ or
+
+ ```bash
+ python main.py [input_img_path] [-s] [-f] [-g]
+ ```
+
+ - `[input_img_path]`: Input image path.
+ - `[input_dir]`: Input directory.
+ - `[output_dir]` (optional): Output directory.
+ - `-s` or `--split-joint-panels` (optional): Split joint panels.
+ - `-f` or `--fallback` (optional): Fall back to a more aggressive method if the first one fails.
+ - `-g` or `--gui` (optional): Use the GUI.
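The documented flags can be sketched with `argparse`; this is an illustrative mapping only, not the project's actual `main.py` implementation:

```python
import argparse

# Illustrative sketch of how the documented CLI flags could map to argparse.
# The positional argument accepts either an image path or a directory; the
# output directory is optional, as described above.
parser = argparse.ArgumentParser(description="Manga Panel Extractor (sketch)")
parser.add_argument("input", help="input image path or input directory")
parser.add_argument("output", nargs="?", default=None, help="optional output directory")
parser.add_argument("-s", "--split-joint-panels", action="store_true", help="split joint panels")
parser.add_argument("-f", "--fallback", action="store_true", help="fall back to a more aggressive method")
parser.add_argument("-g", "--gui", action="store_true", help="use the GUI")

args = parser.parse_args(["pages", "panels", "-s", "-f"])
print(args.split_joint_panels, args.fallback, args.gui)  # -> True True False
```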
+
+ ## Program Explanation and Examples
+
+ See the explanation and examples [here](tests/data/test_performance/README.md).
+
+ ## Features
+
+ The key feature of Manga Panel Extractor is its ability to analyze manga pages and extract the panels from them.
+
+ ## Configuration
+
+ Manga Panel Extractor does not require any additional configuration. It is ready to use out of the box.
+
+ ## Contributing
+
+ Currently, there is limited community involvement in this project. Feel free to contribute by submitting bug reports or feature requests through the [Issues](https://github.com/adenzu/Manga-Panel-Extractor/issues) section.
+
+ ## License
+
+ This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
+
+ ## Troubleshooting
+
+ - If the extraction process is unsuccessful, the output image may resemble the input manga page image itself.
+
+ ## Contact
+
+ If you have any questions, issues, or suggestions, please open an issue in the [Issues](https://github.com/adenzu/Manga-Panel-Extractor/issues) section.
ai-models/2024-11-00/best.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b5e5004af64c6c18d98cec26804cb7ca4f7ee1b1fe44f7aa385a4cc025e31f34
+ size 14367976
app.py ADDED
@@ -0,0 +1,233 @@
+ """
+ adenzu Eren Manga/Comics Panel Extractor (WebUI)
+ Copyright (C) 2025 avan
+
+ This program is a web interface for the adenzu library.
+ The core logic is based on adenzu Eren, the Manga Panel Extractor.
+ Copyright (c) 2023 Eren
+ """
+ import gradio as gr
+ import os
+ import cv2
+ import numpy as np
+ import tempfile
+ import shutil
+ from tqdm import tqdm
+
+ from image_processing.panel import generate_panel_blocks, generate_panel_blocks_by_ai
+
+ # --- UI Description ---
+ DESCRIPTION = """
+ # adenzu Eren Manga/Comics Panel Extractor (WebUI)
+ Upload your manga or comic book images. This tool will automatically analyze and extract each panel.
+ You can choose between a traditional algorithm or an AI-based model for processing.
+ Finally, all extracted panels are packaged into a single ZIP file for you to download.
+
+ The core package author: **adenzu Eren** ([Original Project](https://github.com/adenzu/Manga-Panel-Extractor)).
+ """
+
+ def process_images(
+     input_files,
+     method,
+     separate_folders,
+     # Traditional method params
+     merge_mode,
+     split_joint,
+     fallback,
+     output_mode,
+     # AI method params (currently only merge_mode is used)
+     # This structure allows for future AI-specific params
+     progress=gr.Progress(track_tqdm=True)
+ ):
+     """
+     Main processing function called by Gradio.
+     It takes uploaded files and settings, processes them, and returns a zip file.
+     """
+     if not input_files:
+         raise gr.Error("No images uploaded. Please upload at least one image.")
+
+     # Create a temporary directory to store the processed panels
+     with tempfile.TemporaryDirectory() as temp_panel_dir:
+         print(f"Created temporary directory for panels: {temp_panel_dir}")
+
+         for image_file in tqdm(input_files, desc="Processing Images"):
+             try:
+                 # The image_file object from gr.Files has a .name attribute with the temp path
+                 original_filename = os.path.basename(image_file.name)
+                 filename_no_ext, file_ext = os.path.splitext(original_filename)
+
+                 # Read the image using OpenCV
+                 image = cv2.imread(image_file.name)
+                 if image is None:
+                     print(f"Warning: Could not read image {original_filename}. Skipping.")
+                     continue
+
+                 # Select the processing function based on the chosen method
+                 if method == "Traditional":
+                     panel_blocks = generate_panel_blocks(
+                         image=image,
+                         split_joint_panels=split_joint,
+                         fallback=fallback,
+                         mode=output_mode,
+                         merge=merge_mode
+                     )
+                 elif method == "AI":
+                     panel_blocks = generate_panel_blocks_by_ai(
+                         image=image,
+                         merge=merge_mode
+                     )
+                 else:
+                     # Should not happen with Radio button selection
+                     panel_blocks = []
+
+                 if not panel_blocks:
+                     print(f"Warning: No panels found in {original_filename}.")
+                     continue
+
+                 # Determine the output path for the panels of this image
+                 if separate_folders:
+                     # Create a sub-directory for each image
+                     image_output_folder = os.path.join(temp_panel_dir, filename_no_ext)
+                     os.makedirs(image_output_folder, exist_ok=True)
+                 else:
+                     # Output all panels to the root of the temp directory
+                     image_output_folder = temp_panel_dir
+
+                 # Save each panel block
+                 for i, panel in enumerate(panel_blocks):
+                     if separate_folders:
+                         # e.g., /tmp/xyz/image1/panel_0.png
+                         panel_filename = f"panel_{i}{file_ext if file_ext else '.png'}"
+                     else:
+                         # e.g., /tmp/xyz/image1_panel_0.png
+                         panel_filename = f"{filename_no_ext}_panel_{i}{file_ext if file_ext else '.png'}"
+
+                     output_path = os.path.join(image_output_folder, panel_filename)
+                     cv2.imwrite(output_path, panel)
+
+             except Exception as e:
+                 print(f"Error processing {original_filename}: {e}")
+                 raise gr.Error(f"Failed to process {original_filename}: {e}")
+
+         # After processing all images, check if any panels were generated
+         if not os.listdir(temp_panel_dir):
+             raise gr.Error("Processing complete, but no panels were extracted from any of the images.")
+
+         # --- Create a zip file ---
+
+         # Create a separate temporary directory to hold the final zip file.
+         # Gradio will handle cleaning this up after serving the file to the user.
+         zip_output_dir = tempfile.mkdtemp()
+
+         # Define the base name for our archive (path + filename without extension)
+         zip_path_base = os.path.join(zip_output_dir, "adenzu_output")
+
+         # Create the zip file. shutil.make_archive will add the '.zip' extension.
+         # The first argument is the full path for the output file (minus extension).
+         # The third argument is the directory to be zipped.
+         final_zip_path = shutil.make_archive(
+             base_name=zip_path_base,
+             format='zip',
+             root_dir=temp_panel_dir
+         )
+
+         print(f"Created ZIP file at: {final_zip_path}")
+         # The function returns the full path to the created zip file.
+         # Gradio takes this path and provides it as a download link.
+         return final_zip_path
+
+
+ def main():
+     """
+     Defines and launches the Gradio interface.
+     """
+     with gr.Blocks(theme=gr.themes.Soft()) as demo:
+         gr.Markdown(DESCRIPTION)
+
+         with gr.Row():
+             with gr.Column(scale=1):
+                 # --- Input Components ---
+                 input_files = gr.Files(
+                     label="Upload Manga Pages",
+                     file_types=["image"],
+                     file_count="multiple"
+                 )
+
+                 method = gr.Radio(
+                     label="Processing Method",
+                     choices=["Traditional", "AI"],
+                     value="Traditional",
+                     interactive=True
+                 )
+
+                 # --- Output Options ---
+                 gr.Markdown("### Output Options")
+                 separate_folders = gr.Checkbox(
+                     label="Create a separate folder for each image inside the ZIP",
+                     value=True,
+                     info="If unchecked, all panels will be in the root of the ZIP, with filenames prefixed by the original image name."
+                 )
+
+                 # --- Shared Parameters ---
+                 gr.Markdown("### Shared Parameters")
+                 merge_mode = gr.Dropdown(
+                     label="Merge Mode",
+                     choices=['none', 'vertical', 'horizontal'],
+                     value='none',
+                     info="How to merge detected panels before saving."
+                 )
+
+                 # --- Method-specific Parameters ---
+                 with gr.Group(visible=True) as traditional_params:
+                     gr.Markdown("### Traditional Method Parameters")
+                     split_joint = gr.Checkbox(label="Split Joint Panels", value=False)
+                     fallback = gr.Checkbox(label="Fallback to Threshold Extraction", value=True)
+                     output_mode = gr.Dropdown(
+                         label="Output Mode",
+                         choices=['bounding', 'masked'],
+                         value='bounding'
+                     )
+
+                 with gr.Group(visible=False) as ai_params:
+                     gr.Markdown("### AI Method Parameters")
+                     gr.Markdown("_(Currently, only the shared 'Merge Mode' parameter is used by the AI method.)_")
+
+                 # --- UI Logic to show/hide parameter groups ---
+                 def toggle_parameter_visibility(selected_method):
+                     if selected_method == "Traditional":
+                         return gr.update(visible=True), gr.update(visible=False)
+                     elif selected_method == "AI":
+                         return gr.update(visible=False), gr.update(visible=True)
+
+                 method.change(
+                     fn=toggle_parameter_visibility,
+                     inputs=method,
+                     outputs=[traditional_params, ai_params]
+                 )
+
+                 # --- Action Button ---
+                 generate_button = gr.Button("Generate Panels", variant="primary")
+
+             with gr.Column(scale=1):
+                 # --- Output Component ---
+                 output_zip = gr.File(label="Download ZIP")
+
+         # --- Button Click Action ---
+         generate_button.click(
+             fn=process_images,
+             inputs=[
+                 input_files,
+                 method,
+                 separate_folders,
+                 merge_mode,
+                 split_joint,
+                 fallback,
+                 output_mode
+             ],
+             outputs=output_zip
+         )
+
+     demo.launch(inbrowser=True)
+
+ if __name__ == "__main__":
+     main()
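The ZIP-packaging step in `process_images` boils down to a single `shutil.make_archive` call. A minimal standalone sketch of that step, using a throwaway temp directory and a placeholder file name:

```python
import os
import shutil
import tempfile

# Sketch of the packaging step: archive everything under src_dir into
# <base>.zip, as process_images does with the extracted panels.
src_dir = tempfile.mkdtemp()
with open(os.path.join(src_dir, "panel_0.png"), "wb") as f:
    f.write(b"fake image bytes")  # placeholder content, not a real PNG

zip_dir = tempfile.mkdtemp()
# make_archive appends ".zip" to the base name and returns the full path
zip_path = shutil.make_archive(os.path.join(zip_dir, "adenzu_output"), "zip", root_dir=src_dir)
print(zip_path.endswith(".zip"))  # -> True
```

Note that `root_dir` controls which directory's contents end up at the root of the archive, which is why the separate-folders option above simply changes where the panels are written.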
image_processing/__init__.py ADDED
File without changes
image_processing/image.py ADDED
@@ -0,0 +1,155 @@
+ import cv2
+ import numpy as np
+
+
+ def apply_adaptive_threshold(image: np.ndarray) -> np.ndarray:
+     """
+     Applies an adaptive threshold to the given image
+     """
+     return cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 5, 0)
+
+
+ def is_contour_rectangular(contour: np.ndarray) -> bool:
+     """
+     Returns whether the given contour is rectangular or not
+     """
+     num_sides = 4
+     perimeter = cv2.arcLength(contour, True)
+     approx = cv2.approxPolyDP(contour, 0.01 * perimeter, True)
+     return len(approx) == num_sides
+
+
+ def adaptive_vconcat(images: list[np.ndarray], fill_color: tuple[int, int, int] = (255, 255, 255)) -> np.ndarray:
+     max_width = max(img.shape[1] for img in images)
+
+     # Pad each image on the right to match the widest image
+     resized_images = []
+     for img in images:
+         resized_img = cv2.copyMakeBorder(img,
+                                          top=0, bottom=0,
+                                          left=0, right=max_width - img.shape[1],
+                                          borderType=cv2.BORDER_CONSTANT,
+                                          value=fill_color)
+         resized_images.append(resized_img)
+
+     # Concatenate vertically
+     return np.vstack(resized_images)
+
+
+ def adaptive_hconcat(images: list[np.ndarray], fill_color: tuple[int, int, int] = (255, 255, 255)) -> np.ndarray:
+     max_height = max(img.shape[0] for img in images)
+
+     # Pad each image at the bottom to match the tallest image
+     resized_images = []
+     for img in images:
+         resized_img = cv2.copyMakeBorder(img,
+                                          top=0, bottom=max_height - img.shape[0],
+                                          left=0, right=0,
+                                          borderType=cv2.BORDER_CONSTANT,
+                                          value=fill_color)
+         resized_images.append(resized_img)
+
+     # Concatenate horizontally
+     return np.hstack(resized_images)
+
+
+ def group_contours_vertically(contours) -> list[list[np.ndarray]]:
+     """
+     Groups the given contours vertically
+     """
+     ERROR_THRESHOLD = 0.05
+     contours = sorted(contours, key=lambda c: cv2.boundingRect(c)[1])
+     grouped_contours = [[contours[0]]]
+     for contour in contours[1:]:
+         found_group = False
+         contour_x, contour_y, contour_w, contour_h = cv2.boundingRect(contour)
+         for group in grouped_contours[::-1]:
+             group_x, group_y, group_w, group_h = cv2.boundingRect(group[-1])
+             y_diff = abs(contour_y - group_y) - group_h
+             if y_diff < 0 or y_diff > min(contour_h, group_h):
+                 continue
+             group_x_center = group_x + group_w / 2
+             contour_x_center = contour_x + contour_w / 2
+             if abs(group_x_center - contour_x_center) < ERROR_THRESHOLD * min(group_w, contour_w):
+                 group.append(contour)
+                 found_group = True
+                 break
+         if not found_group:
+             grouped_contours.append([contour])
+     return grouped_contours
+
+
+ def group_contours_horizontally(contours) -> list[list[np.ndarray]]:
+     """
+     Groups the given contours horizontally
+     """
+     ERROR_THRESHOLD = 0.05
+     contours = sorted(contours, key=lambda c: cv2.boundingRect(c)[0])
+     grouped_contours = [[contours[0]]]
+     for contour in contours[1:]:
+         found_group = False
+         contour_x, contour_y, contour_w, contour_h = cv2.boundingRect(contour)
+         for group in grouped_contours[::-1]:
+             group_x, group_y, group_w, group_h = cv2.boundingRect(group[-1])
+             x_diff = abs(contour_x - group_x) - group_w
+             if x_diff < 0 or x_diff > min(contour_w, group_w):
+                 continue
+             group_y_center = group_y + group_h / 2
+             contour_y_center = contour_y + contour_h / 2
+             if abs(group_y_center - contour_y_center) < ERROR_THRESHOLD * min(group_h, contour_h):
+                 group.append(contour)
+                 found_group = True
+                 break
+         if not found_group:
+             grouped_contours.append([contour])
+     return grouped_contours
+
+ def group_bounding_boxes_vertically(bounding_boxes) -> list[list[tuple[int, int, int, int]]]:
+     """
+     Groups the given bounding boxes vertically
+     """
+     ERROR_THRESHOLD = 0.05
+     bounding_boxes = sorted(bounding_boxes, key=lambda bb: bb[1])
+     grouped_bounding_boxes = [[bounding_boxes[0]]]
+     for bounding_box in bounding_boxes[1:]:
+         found_group = False
+         bb_x, bb_y, bb_w, bb_h = bounding_box
+         for group in grouped_bounding_boxes[::-1]:
+             group_x, group_y, group_w, group_h = group[-1]
+             y_diff = abs(bb_y - group_y) - group_h
+             if y_diff < 0 or y_diff > min(bb_h, group_h):
+                 continue
+             group_x_center = group_x + group_w / 2
+             bb_x_center = bb_x + bb_w / 2
+             if abs(group_x_center - bb_x_center) < ERROR_THRESHOLD * min(group_w, bb_w):
+                 group.append(bounding_box)
+                 found_group = True
+                 break
+         if not found_group:
+             grouped_bounding_boxes.append([bounding_box])
+     return grouped_bounding_boxes
+
+ def group_bounding_boxes_horizontally(bounding_boxes) -> list[list[tuple[int, int, int, int]]]:
+     """
+     Groups the given bounding boxes horizontally
+     """
+     ERROR_THRESHOLD = 0.05
+     bounding_boxes = sorted(bounding_boxes, key=lambda bb: bb[0])
+     grouped_bounding_boxes = [[bounding_boxes[0]]]
+     for bounding_box in bounding_boxes[1:]:
+         found_group = False
+         bb_x, bb_y, bb_w, bb_h = bounding_box
+         for group in grouped_bounding_boxes[::-1]:
+             group_x, group_y, group_w, group_h = group[-1]
+             x_diff = abs(bb_x - group_x) - group_w
+             if x_diff < 0 or x_diff > min(bb_w, group_w):
+                 continue
+             group_y_center = group_y + group_h / 2
+             bb_y_center = bb_y + bb_h / 2
+             if abs(group_y_center - bb_y_center) < ERROR_THRESHOLD * min(group_h, bb_h):
+                 group.append(bounding_box)
+                 found_group = True
+                 break
+         if not found_group:
+             grouped_bounding_boxes.append([bounding_box])
+     return grouped_bounding_boxes
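`adaptive_vconcat` pads narrower images on the right before stacking them. The same padding can be sketched with plain NumPy (an assumed equivalent using `np.pad` in place of `cv2.copyMakeBorder`, single-channel for brevity):

```python
import numpy as np

# Pad each image on the right to the maximum width, then stack vertically,
# mirroring adaptive_vconcat above (white fill, grayscale images).
def vconcat_sketch(images, fill=255):
    max_w = max(img.shape[1] for img in images)
    padded = [np.pad(img, ((0, 0), (0, max_w - img.shape[1])),
                     constant_values=fill) for img in images]
    return np.vstack(padded)

a = np.zeros((2, 3), dtype=np.uint8)  # 2x3 black image
b = np.zeros((1, 5), dtype=np.uint8)  # 1x5 black image
out = vconcat_sketch([a, b])
print(out.shape)  # -> (3, 5)
```

The narrower image `a` gains two columns of white (255) padding on the right so the final stack is rectangular.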
image_processing/model.py ADDED
@@ -0,0 +1,33 @@
+ class Model:
2
+ def __init__(self):
3
+ self.model = None
4
+ self.imported = False
5
+
6
+ def load(self):
7
+ if self.model is None:
8
+ self.__load()
9
+
10
+ def __load(self):
11
+ if not self.imported:
12
+ self.imported = True
13
+ import torch
14
+ import pathlib
15
+ import sys
16
+ import os
17
+ from myutils.respath import resource_path
18
+
19
+ # Redirect sys.stderr to a file or a valid stream
20
+ if sys.stderr is None:
21
+ sys.stderr = open(os.devnull, 'w')
22
+
23
+ temp = pathlib.PosixPath
24
+ pathlib.PosixPath = pathlib.WindowsPath
25
+ self.model = torch.hub.load('ultralytics/yolov5', 'custom', path=resource_path('ai-models/2024-11-00/best.pt'))
26
+ pathlib.PosixPath = temp
27
+
28
+ def __call__(self, *args, **kwds):
29
+ if self.model is None:
30
+ self.__load()
31
+ return self.model(*args, **kwds)
32
+
33
+ model = Model()
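The `Model` wrapper above is a lazy-loading singleton: the heavy `torch.hub.load` call is deferred until the model is first invoked. A minimal self-contained sketch of that pattern, with a stand-in loader instead of the real YOLOv5 checkpoint:

```python
# Minimal sketch of the lazy-load pattern used by Model: the expensive
# initialization runs only on the first call, then the instance is reused.
class LazyModel:
    def __init__(self, loader):
        self._loader = loader   # callable that builds the real model
        self._model = None

    def __call__(self, *args, **kwds):
        if self._model is None:  # load on first use only
            self._model = self._loader()
        return self._model(*args, **kwds)

calls = []
# Stand-in loader: records that it ran, then returns a trivial "model"
lazy = LazyModel(lambda: calls.append("loaded") or (lambda x: x * 2))
print(lazy(3), calls)  # the loader runs exactly once
```

Module-level `model = Model()` then gives every importer the same lazily-initialized instance.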
image_processing/panel.py ADDED
@@ -0,0 +1,464 @@
1
+ import os
2
+ from typing import Callable
3
+ import cv2
4
+ import warnings
5
+ import numpy as np
6
+ from image_processing.image import is_contour_rectangular, apply_adaptive_threshold, group_contours_horizontally, group_contours_vertically, adaptive_hconcat, adaptive_vconcat, group_bounding_boxes_horizontally, group_bounding_boxes_vertically
7
+ from myutils.myutils import load_images, load_image
8
+ from tqdm import tqdm
9
+ from image_processing.model import model
10
+
11
+ class OutputMode:
12
+ BOUNDING = 'bounding'
13
+ MASKED = 'masked'
14
+
15
+ def from_index(index: int) -> str:
16
+ return [OutputMode.BOUNDING, OutputMode.MASKED][index]
17
+
18
+
19
+ class MergeMode:
20
+ NONE = 'none'
21
+ VERTICAL = 'vertical'
22
+ HORIZONTAL = 'horizontal'
23
+
24
+ def from_index(index: int) -> str:
25
+ return [MergeMode.NONE, MergeMode.VERTICAL, MergeMode.HORIZONTAL][index]
26
+
27
+
28
+ def get_background_intensity_range(grayscale_image: np.ndarray, min_range: int = 1) -> tuple[int, int]:
29
+ """
30
+ Returns the minimum and maximum intensity values of the background of the image
31
+ """
32
+ edges = [grayscale_image[-1, :], grayscale_image[0, :], grayscale_image[:, 0], grayscale_image[:, -1]]
33
+ sorted_edges = sorted(edges, key=lambda x: np.var(x))
34
+
35
+ least_varied_edge = sorted_edges[0]
36
+
37
+ max_intensity = max(least_varied_edge)
38
+ min_intensity = max(min(min(least_varied_edge), max_intensity - min_range), 0)
39
+
40
+ return min_intensity, max_intensity
41
+
42
+
43
+ def generate_background_mask(grayscale_image: np.ndarray) -> np.ndarray:
44
+ """
45
+ Generates a mask by focusing on the largest area of white pixels
46
+ """
47
+ WHITE = 255
48
+ LESS_WHITE, _ = get_background_intensity_range(grayscale_image, 25)
49
+ LESS_WHITE = max(LESS_WHITE, 240)
50
+
51
+ ret, thresh = cv2.threshold(grayscale_image, LESS_WHITE, WHITE, cv2.THRESH_BINARY)
52
+ nlabels, labels, stats, centroids = cv2.connectedComponentsWithStats(thresh)
53
+
54
+ mask = np.zeros_like(thresh)
55
+
56
+ PAGE_TO_SEGMENT_RATIO = 1024
57
+
58
+ halting_area_size = mask.size // PAGE_TO_SEGMENT_RATIO
59
+
60
+ mask_height, mask_width = mask.shape
61
+ base_background_size_error_threshold = 0.05
62
+ whole_background_min_width = mask_width * (1 - base_background_size_error_threshold)
63
+ whole_background_min_height = mask_height * (1 - base_background_size_error_threshold)
64
+
65
+     for i in np.argsort(stats[1:, 4])[::-1]:
+         contour_index = i + 1
+         x, y, w, h, area = stats[contour_index]
+         if area < halting_area_size:
+             break
+         if (
+             (w > whole_background_min_width) or
+             (h > whole_background_min_height) or
+             (is_contour_rectangular(cv2.findContours((labels == contour_index).astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[0][0]))
+         ):
+             mask[labels == contour_index] = WHITE
+
+     mask = cv2.dilate(mask, np.ones((3, 3), np.uint8), iterations=2)
+
+     return mask
+
+
+ def extract_panels(
+     image: np.ndarray,
+     panel_contours: list[np.ndarray],
+     accept_page_as_panel: bool = True,
+     mode: str = OutputMode.BOUNDING,
+     fill_in_color: tuple[int, int, int] = (0, 0, 0),
+ ) -> list[np.ndarray]:
+     """
+     Extracts panels from the image using the given contours corresponding to the panels
+
+     Parameters:
+     - image: The image to extract the panels from
+     - panel_contours: The contours corresponding to the panels
+     - accept_page_as_panel: Whether to accept the whole page as a panel
+     - mode: The mode to use for extraction
+         - 'masked': Extracts the panels by cutting out only the inside of the contours
+         - 'bounding': Extracts the panels by using the bounding boxes of the contours
+     - fill_in_color: The color to fill in the background of the panel images
+     """
+     height, width = image.shape[:2]
+
+     returned_panels = []
+
+     for contour in panel_contours:
+         x, y, w, h = cv2.boundingRect(contour)
+
+         if not accept_page_as_panel and ((w >= width * 0.99) or (h >= height * 0.99)):
+             continue
+
+         if mode == 'masked':
+             mask = np.zeros_like(image)
+             cv2.drawContours(mask, [contour], -1, (255, 255, 255), -1)
+             masked_image = cv2.bitwise_and(image, mask)
+             fitted_panel = masked_image[y:y + h, x:x + w]
+             fitted_panel = cv2.bitwise_or(cv2.bitwise_and(cv2.bitwise_not(mask[y:y + h, x:x + w]), fill_in_color), fitted_panel)
+         else:
+             fitted_panel = image[y:y + h, x:x + w]
+
+         returned_panels.append(fitted_panel)
+
+     return returned_panels
+
+
+ def preprocess_image(grayscale_image: np.ndarray) -> np.ndarray:
+     """
+     Preprocesses the image for panel extraction
+     """
+     processed_image = cv2.GaussianBlur(grayscale_image, (3, 3), 0)
+     processed_image = cv2.Laplacian(processed_image, -1)
+     return processed_image
+
+
+ def preprocess_image_with_dilation(grayscale_image: np.ndarray) -> np.ndarray:
+     """
+     Preprocesses the image for panel extraction
+     """
+     processed_image = cv2.GaussianBlur(grayscale_image, (3, 3), 0)
+     processed_image = cv2.Laplacian(processed_image, -1)
+     processed_image = cv2.dilate(processed_image, np.ones((5, 5), np.uint8), iterations=1)
+     processed_image = 255 - processed_image
+     return processed_image
+
+
+ def joint_panel_split_extraction(grayscale_image: np.ndarray, background_mask: np.ndarray) -> np.ndarray:
+     """
+     Extracts the panels from the image while splitting the joint panels
+     """
+     pixels_before = np.count_nonzero(background_mask)
+     background_mask = cv2.ximgproc.thinning(background_mask)
+
+     up_kernel = np.array([[0, 0, 0], [0, 1, 0], [0, 1, 0]], np.uint8)
+     down_kernel = np.array([[0, 1, 0], [0, 1, 0], [0, 0, 0]], np.uint8)
+     left_kernel = np.array([[0, 0, 0], [0, 1, 1], [0, 0, 0]], np.uint8)
+     right_kernel = np.array([[0, 0, 0], [1, 1, 0], [0, 0, 0]], np.uint8)
+
+     down_right_diagonal_kernel = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 0]], np.uint8)
+     down_left_diagonal_kernel = np.array([[0, 0, 1], [0, 1, 0], [0, 0, 0]], np.uint8)
+     up_left_diagonal_kernel = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 1]], np.uint8)
+     up_right_diagonal_kernel = np.array([[0, 0, 0], [0, 1, 0], [1, 0, 0]], np.uint8)
+
+     PAGE_TO_JOINT_OBJECT_RATIO = 3
+     image_height, image_width = grayscale_image.shape
+
+     height_based_size = image_height // PAGE_TO_JOINT_OBJECT_RATIO
+     width_based_size = (2 * image_width) // PAGE_TO_JOINT_OBJECT_RATIO
+
+     height_based_size += height_based_size % 2 + 1
+     width_based_size += width_based_size % 2 + 1
+
+     up_dilation_kernel = np.zeros((height_based_size, height_based_size), np.uint8)
+     up_dilation_kernel[height_based_size // 2:, height_based_size // 2] = 1
+
+     down_dilation_kernel = np.zeros((height_based_size, height_based_size), np.uint8)
+     down_dilation_kernel[:height_based_size // 2 + 1, height_based_size // 2] = 1
+
+     left_dilation_kernel = np.zeros((width_based_size, width_based_size), np.uint8)
+     left_dilation_kernel[width_based_size // 2, width_based_size // 2:] = 1
+
+     right_dilation_kernel = np.zeros((width_based_size, width_based_size), np.uint8)
+     right_dilation_kernel[width_based_size // 2, :width_based_size // 2 + 1] = 1
+
+     min_based_size = min(width_based_size, height_based_size)
+
+     down_right_dilation_kernel = np.identity(min_based_size // 2 + 1, dtype=np.uint8)
+     down_right_dilation_kernel = np.pad(down_right_dilation_kernel, ((0, min_based_size // 2), (0, min_based_size // 2)))
+
+     up_left_dilation_kernel = np.identity(min_based_size // 2 + 1, dtype=np.uint8)
+     up_left_dilation_kernel = np.pad(up_left_dilation_kernel, ((min_based_size // 2, 0), (0, min_based_size // 2)))
+
+     up_right_dilation_kernel = np.flip(np.identity(min_based_size // 2 + 1, dtype=np.uint8), axis=1)
+     up_right_dilation_kernel = np.pad(up_right_dilation_kernel, ((min_based_size // 2, 0), (0, min_based_size // 2)))
+
+     down_left_dilation_kernel = np.flip(np.identity(min_based_size // 2 + 1, dtype=np.uint8), axis=1)
+     down_left_dilation_kernel = np.pad(down_left_dilation_kernel, ((0, min_based_size // 2), (min_based_size // 2, 0)))
+
+     match_kernels = [
+         up_kernel,
+         down_kernel,
+         left_kernel,
+         right_kernel,
+         down_right_diagonal_kernel,
+         down_left_diagonal_kernel,
+         up_left_diagonal_kernel,
+         up_right_diagonal_kernel,
+     ]
+
+     dilation_kernels = [
+         up_dilation_kernel,
+         down_dilation_kernel,
+         left_dilation_kernel,
+         right_dilation_kernel,
+         down_right_dilation_kernel,
+         down_left_dilation_kernel,
+         up_left_dilation_kernel,
+         up_right_dilation_kernel,
+     ]
+
+     def get_dots(grayscale_image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
+         temp = cv2.matchTemplate(grayscale_image, kernel, cv2.TM_CCOEFF_NORMED)
+         _, temp = cv2.threshold(temp, 0.9, 1, cv2.THRESH_BINARY)
+         temp = np.where(temp == 1, 255, 0).astype(np.uint8)
+         pad_height = (kernel.shape[0] - 1) // 2
+         pad_width = (kernel.shape[1] - 1) // 2
+         temp = cv2.copyMakeBorder(temp, pad_height, kernel.shape[0] - pad_height - 1, pad_width, kernel.shape[1] - pad_width - 1, cv2.BORDER_CONSTANT, value=0)
+         return temp
+
+     for match_kernel, dilation_kernel in zip(match_kernels, dilation_kernels):
+         dots = get_dots(background_mask, match_kernel)
+         lines = cv2.dilate(dots, dilation_kernel, iterations=1)
+         background_mask = cv2.bitwise_or(background_mask, lines)
+
+     pixels_now = np.count_nonzero(background_mask)
+     dilation_size = pixels_before // (4 * pixels_now)
+     dilation_size += dilation_size % 2 + 1
+     background_mask = cv2.dilate(background_mask, np.ones((dilation_size, dilation_size), np.uint8), iterations=1)
+
+     page_without_background = 255 - background_mask
+
+     return page_without_background
+
+
+ def is_contour_sufficiently_big(contour: np.ndarray, image_height: int, image_width: int) -> bool:
+     PAGE_TO_PANEL_RATIO = 32
+     image_area = image_width * image_height
+     area_threshold = image_area // PAGE_TO_PANEL_RATIO
+     area = cv2.contourArea(contour)
+     return area > area_threshold
+
+
+ def threshold_extraction(
+     image: np.ndarray,
+     grayscale_image: np.ndarray,
+     mode: str = OutputMode.BOUNDING,
+ ) -> list[np.ndarray]:
+     """
+     Extracts panels from the image using thresholding
+     """
+     processed_image = cv2.GaussianBlur(grayscale_image, (3, 3), 0)
+     processed_image = cv2.Laplacian(processed_image, -1)
+     _, thresh = cv2.threshold(processed_image, 8, 255, cv2.THRESH_BINARY)
+     processed_image = apply_adaptive_threshold(processed_image)
+     processed_image = cv2.subtract(processed_image, thresh)
+     processed_image = cv2.dilate(processed_image, np.ones((3, 3), np.uint8), iterations=2)
+     contours, _ = cv2.findContours(processed_image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
+     contours = list(filter(lambda c: is_contour_sufficiently_big(c, image.shape[0], image.shape[1]), contours))
+     panels = extract_panels(image, contours, False, mode=mode)
+
+     return panels
+
+
+ def get_page_without_background(grayscale_image: np.ndarray, background_mask: np.ndarray, split_joint_panels: bool = False) -> np.ndarray:
+     """
+     Returns the page without the background
+     """
+     STRIPE_FORMAT_MASK_AREA_RATIO = 0.3
+
+     mask_area = np.count_nonzero(background_mask)
+     mask_area_ratio = mask_area / background_mask.size
+
+     if STRIPE_FORMAT_MASK_AREA_RATIO > mask_area_ratio and split_joint_panels:
+         page_without_background = joint_panel_split_extraction(grayscale_image, background_mask)
+     else:
+         page_without_background = cv2.subtract(grayscale_image, background_mask)
+
+     return page_without_background
+
+
+ def get_fallback_panels(
+     image: np.ndarray,
+     grayscale_image: np.ndarray,
+     fallback: bool,
+     panels: list[np.ndarray],
+     mode: str = OutputMode.BOUNDING,
+ ) -> list[np.ndarray]:
+     """
+     Checks if the fallback is needed and returns the appropriate panels
+
+     Parameters:
+     - mode: The mode to use for extraction
+         - 'masked': Extracts the panels by cutting out only the inside of the contours
+         - 'bounding': Extracts the panels by using the bounding boxes of the contours
+     """
+     if fallback and len(panels) < 2:
+         tmp = threshold_extraction(image, grayscale_image, mode=mode)
+         if len(tmp) > len(panels):
+             return tmp
+
+     return panels
+
+
+ def generate_panel_blocks(
+     image: np.ndarray,
+     background_generator: Callable[[np.ndarray], np.ndarray] = generate_background_mask,
+     split_joint_panels: bool = False,
+     fallback: bool = True,
+     mode: str = OutputMode.BOUNDING,
+     merge: str = MergeMode.NONE
+ ) -> list[np.ndarray]:
+     """
+     Generates the separate panel images from the base image
+
+     Parameters:
+     - mode: The mode to use for extraction
+         - 'masked': Extracts the panels by cutting out only the inside of the contours
+         - 'bounding': Extracts the panels by using the bounding boxes of the contours
+     """
+
+     grayscale_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
+     processed_image = preprocess_image_with_dilation(grayscale_image)
+     background_mask = background_generator(processed_image)
+     page_without_background = get_page_without_background(grayscale_image, background_mask, split_joint_panels)
+     contours, _ = cv2.findContours(page_without_background, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
+     contours = list(filter(lambda c: is_contour_sufficiently_big(c, image.shape[0], image.shape[1]), contours))
+
+     def get_panels(contours):
+         panels = extract_panels(image, contours, mode=mode)
+         panels = get_fallback_panels(image, grayscale_image, fallback, panels, mode=mode)
+         return panels
+
+     panels = []
+     if merge == MergeMode.NONE:
+         panels = get_panels(contours)
+     elif merge == MergeMode.HORIZONTAL:
+         grouped_contours = group_contours_horizontally(contours)
+         for group in grouped_contours:
+             panels.append(adaptive_hconcat(get_panels(group)))
+     elif merge == MergeMode.VERTICAL:
+         grouped_contours = group_contours_vertically(contours)
+         for group in grouped_contours:
+             panels.append(adaptive_vconcat(get_panels(group)))
+
+     return panels
+
+
+ def generate_panel_blocks_by_ai(image: np.ndarray, merge: str = MergeMode.NONE) -> list[np.ndarray]:
+     """
+     Generates the separate panel images from the base image using AI with merge
+     """
+     grayscale_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
+     processed_image = preprocess_image(grayscale_image)
+
+     warnings.filterwarnings("ignore", category=FutureWarning)  # Ignore 'FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.'
+     results = model(processed_image)
+     warnings.filterwarnings("default", category=FutureWarning)
+
+     bounding_boxes = []
+     for detection in results.xyxy[0]:  # Access predictions in (x1, y1, x2, y2, confidence, class) format
+         x1, y1, x2, y2, conf, cls = detection.tolist()  # Convert to Python list
+         x1, y1, x2, y2 = map(int, [x1, y1, x2, y2])
+         bounding_boxes.append((x1, y1, x2 - x1, y2 - y1))
+
+     def get_panels(bounding_boxes):
+         panels = []
+         for x, y, w, h in bounding_boxes:
+             panel = image[y:y + h, x:x + w]
+             panels.append(panel)
+         return panels
+
+     panels = []
+     if merge == MergeMode.NONE:
+         panels = get_panels(bounding_boxes)
+     elif merge == MergeMode.HORIZONTAL:
+         grouped_bounding_boxes = group_bounding_boxes_horizontally(bounding_boxes)
+         for group in grouped_bounding_boxes:
+             panels.append(adaptive_hconcat(get_panels(group)))
+     elif merge == MergeMode.VERTICAL:
+         grouped_bounding_boxes = group_bounding_boxes_vertically(bounding_boxes)
+         for group in grouped_bounding_boxes:
+             panels.append(adaptive_vconcat(get_panels(group)))
+
+     return panels
+
+
+ def extract_panels_for_image(
+     image_path: str,
+     output_dir: str,
+     fallback: bool = True,
+     split_joint_panels: bool = False,
+     mode: str = OutputMode.BOUNDING,
+     merge: str = MergeMode.NONE
+ ) -> None:
+     """
+     Extracts panels for a single image
+     """
+     if not os.path.exists(image_path):
+         return
+     image_path = os.path.abspath(image_path)
+     image = load_image(os.path.dirname(image_path), os.path.basename(image_path))
+     image_name, image_ext = os.path.splitext(image.image_name)
+     panel_blocks = generate_panel_blocks(image.image, split_joint_panels=split_joint_panels, fallback=fallback, mode=mode, merge=merge)
+     for k, panel in enumerate(tqdm(panel_blocks, total=len(panel_blocks))):
+         out_path = os.path.join(output_dir, f"{image_name}_{k}{image_ext}")
+         cv2.imwrite(out_path, panel)
+
+
+ def extract_panels_for_images_in_folder(
+     input_dir: str,
+     output_dir: str,
+     fallback: bool = True,
+     split_joint_panels: bool = False,
+     mode: str = OutputMode.BOUNDING,
+     merge: str = MergeMode.NONE
+ ) -> tuple[int, int]:
+     """
+     Basically the main function of the program,
+     this is written with cli usage in mind
+     """
+     if not os.path.exists(output_dir):
+         return (0, 0)
+     files = os.listdir(input_dir)
+     num_files = len(files)
+     num_panels = 0
+     for image in tqdm(load_images(input_dir), total=num_files):
+         image_name, image_ext = os.path.splitext(image.image_name)
+         panel_blocks = generate_panel_blocks(image.image, fallback=fallback, split_joint_panels=split_joint_panels, mode=mode, merge=merge)
+         for j, panel in enumerate(panel_blocks):
+             out_path = os.path.join(output_dir, f"{image_name}_{j}{image_ext}")
+             cv2.imwrite(out_path, panel)
+         num_panels += len(panel_blocks)
+     return (num_files, num_panels)
+
+
+ def extract_panels_for_images_in_folder_by_ai(
+     input_dir: str,
+     output_dir: str
+ ) -> tuple[int, int]:
+     """
+     Basically the main function of the program,
+     this is written with cli usage in mind
+     """
+     if not os.path.exists(output_dir):
+         return (0, 0)
+     files = os.listdir(input_dir)
+     num_files = len(files)
+     num_panels = 0
+     for image in tqdm(load_images(input_dir), total=num_files):
+         image_name, image_ext = os.path.splitext(image.image_name)
+         panel_blocks = generate_panel_blocks_by_ai(image.image)
+         for j, panel in enumerate(panel_blocks):
+             out_path = os.path.join(output_dir, f"{image_name}_{j}{image_ext}")
+             cv2.imwrite(out_path, panel)
+         num_panels += len(panel_blocks)
+     return (num_files, num_panels)
+
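The `'bounding'` branch of `extract_panels` above comes down to a NumPy slice per bounding box, plus the near-page-size rejection check from the top of the loop. A minimal standalone sketch of just that logic (the `crop_panels` helper and the sample boxes are illustrative, not part of this commit):

```python
import numpy as np

def crop_panels(image, boxes, accept_page_as_panel=True):
    # Mirrors the 'bounding' mode: each (x, y, w, h) box becomes a plain slice.
    # Boxes spanning ~99% of the page width or height can be rejected, as in
    # extract_panels when accept_page_as_panel is False.
    height, width = image.shape[:2]
    panels = []
    for x, y, w, h in boxes:
        if not accept_page_as_panel and ((w >= width * 0.99) or (h >= height * 0.99)):
            continue
        panels.append(image[y:y + h, x:x + w])
    return panels

page = np.zeros((200, 100, 3), np.uint8)  # synthetic 100x200 "page"
panels = crop_panels(page, [(10, 10, 50, 40), (0, 0, 100, 200)], accept_page_as_panel=False)
print(len(panels), panels[0].shape)  # the full-page box is rejected
```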
myutils/__init__.py ADDED
File without changes
myutils/myutils.py ADDED
@@ -0,0 +1,71 @@
+ import os
+ from dataclasses import dataclass
+ import cv2
+ import numpy as np
+
+ supported_types = [
+     ".bmp",
+     ".dib",
+     ".jpeg",
+     ".jpg",
+     ".jpe",
+     ".jp2",
+     ".png",
+     ".webp",
+     ".pbm",
+     ".pgm",
+     ".ppm",
+     ".pxm",
+     ".pnm",
+     ".pfm",
+     ".sr",
+     ".ras",
+     ".tiff",
+     ".tif",
+     ".exr",
+     ".hdr",
+     ".pic",
+ ]
+
+
+ @dataclass
+ class ImageWithFilename:
+     image: np.ndarray
+     image_name: str
+
+
+ def get_file_extension(file_path: str) -> str:
+     """
+     Returns the extension of the given file path
+     """
+     return os.path.splitext(file_path)[1]
+
+
+ def load_image(directory_path: str, image_name: str) -> ImageWithFilename:
+     """
+     Returns an ImageWithFilename object from the given image name in the given directory
+     """
+     image = cv2.imread(os.path.join(directory_path, image_name))
+     return ImageWithFilename(image, image_name)
+
+
+ def load_images(directory_path: str) -> list[ImageWithFilename]:
+     """
+     Returns a list of ImageWithFilename objects from the images in the given directory
+     """
+     file_names = get_file_names(directory_path)
+     image_names = filter(lambda x: get_file_extension(x) in supported_types, file_names)
+     return [load_image(directory_path, image_name) for image_name in image_names]
+
+
+ def get_file_names(directory_path: str) -> list[str]:
+     """
+     Returns the names of the files in the given directory
+     """
+     if not os.path.exists(directory_path):
+         return []
+     return [
+         file_name
+         for file_name in os.listdir(directory_path)
+         if os.path.isfile(os.path.join(directory_path, file_name))
+     ]
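The filtering step inside `load_images` is just an extension check against `supported_types`. A small self-contained sketch of that step (abbreviated extension list; the `filter_images` name is hypothetical):

```python
import os

# Abbreviated stand-in for the module's supported_types list
supported_types = [".png", ".jpg", ".jpeg", ".webp"]

def filter_images(file_names):
    # Keep only names whose extension (as returned by os.path.splitext) is supported
    return [n for n in file_names if os.path.splitext(n)[1] in supported_types]

result = filter_images(["a.png", "b.txt", "c.jpeg"])
print(result)  # → ['a.png', 'c.jpeg']
```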
myutils/respath.py ADDED
@@ -0,0 +1,16 @@
+ import os
+ import sys
+
+ build_paths = {os.path.normpath(src): os.path.normpath(dst) for src, dst in [
+     ("icon.ico", "icon.ico"),
+     ("ai-models/2024-11-00/best.pt", "models/best.pt"),
+ ]}
+
+
+ # Function to get the correct path to bundled resources
+ def resource_path(relative_path: str) -> str:
+     relative_path = os.path.normpath(relative_path)
+     if hasattr(sys, '_MEIPASS'):
+         # Running in a PyInstaller bundle
+         return build_paths[relative_path]
+     else:
+         return relative_path
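`resource_path` relies on PyInstaller setting `sys._MEIPASS` only inside a frozen bundle; outside one, the path comes back unchanged. A trimmed sketch of the same check (single hypothetical mapping entry):

```python
import os
import sys

# Hypothetical one-entry mapping, mirroring build_paths above
build_paths = {os.path.normpath("icon.ico"): os.path.normpath("icon.ico")}

def resource_path(relative_path: str) -> str:
    relative_path = os.path.normpath(relative_path)
    if hasattr(sys, "_MEIPASS"):  # attribute exists only inside a PyInstaller bundle
        return build_paths[relative_path]
    return relative_path

resolved = resource_path("icon.ico")
print(resolved)  # outside a bundle, the normalized input path is returned as-is
```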
requirements.txt ADDED
@@ -0,0 +1,8 @@
+ --extra-index-url https://download.pytorch.org/whl/cu126
+
+ gradio
+ numpy
+ opencv-contrib-python
+ tqdm
+ torch
+ yolov5
webui.bat ADDED
@@ -0,0 +1,162 @@
+ @echo off
+
+ :: The original source of the webui.bat file is stable-diffusion-webui
+ :: Modified and enhanced by Gemini with features for venv management and requirements handling.
+
+ :: --------- Configuration ---------
+ set COMMANDLINE_ARGS=
+ :: Define the name of the Launch application
+ set APPLICATION_NAME=app.py
+ :: Define the name of the virtual environment directory
+ set VENV_NAME=venv
+ :: Set to 1 to always attempt to update packages from requirements.txt on every launch
+ set ALWAYS_UPDATE_REQS=0
+ :: ---------------------------------
+
+
+ :: Set PYTHON executable if not already defined
+ if not defined PYTHON (set PYTHON=python)
+ :: Set VENV_DIR using VENV_NAME if not already defined
+ if not defined VENV_DIR (set "VENV_DIR=%~dp0%VENV_NAME%")
+
+ mkdir tmp 2>NUL
+
+ :: Check if Python is callable
+ %PYTHON% -c "" >tmp/stdout.txt 2>tmp/stderr.txt
+ if %ERRORLEVEL% == 0 goto :check_pip
+ echo Couldn't launch python
+ goto :show_stdout_stderr
+
+ :check_pip
+ :: Check if pip is available
+ %PYTHON% -mpip --help >tmp/stdout.txt 2>tmp/stderr.txt
+ if %ERRORLEVEL% == 0 goto :start_venv
+ :: If pip is not available and PIP_INSTALLER_LOCATION is set, try to install pip
+ if "%PIP_INSTALLER_LOCATION%" == "" goto :show_stdout_stderr
+ %PYTHON% "%PIP_INSTALLER_LOCATION%" >tmp/stdout.txt 2>tmp/stderr.txt
+ if %ERRORLEVEL% == 0 goto :start_venv
+ echo Couldn't install pip
+ goto :show_stdout_stderr
+
+ :start_venv
+ :: Skip venv creation/activation if VENV_DIR is explicitly set to "-"
+ if ["%VENV_DIR%"] == ["-"] goto :skip_venv_entirely
+ :: Skip venv creation/activation if SKIP_VENV is set to "1"
+ if ["%SKIP_VENV%"] == ["1"] goto :skip_venv_entirely
+
+ :: Check if the venv already exists by looking for Python.exe in its Scripts directory
+ dir "%VENV_DIR%\Scripts\Python.exe" >tmp/stdout.txt 2>tmp/stderr.txt
+ if %ERRORLEVEL% == 0 goto :activate_venv_and_maybe_update
+
+ :: Venv does not exist, create it
+ echo Virtual environment not found in "%VENV_DIR%". Creating a new one.
+ for /f "delims=" %%i in ('CALL %PYTHON% -c "import sys; print(sys.executable)"') do set PYTHON_FULLNAME="%%i"
+ echo Creating venv in directory %VENV_DIR% using python %PYTHON_FULLNAME%
+ %PYTHON_FULLNAME% -m venv "%VENV_DIR%" >tmp/stdout.txt 2>tmp/stderr.txt
+ if %ERRORLEVEL% NEQ 0 (
+     echo Unable to create venv in directory "%VENV_DIR%"
+     goto :show_stdout_stderr
+ )
+ echo Venv created.
+
+ :: Install requirements for the first time if venv was just created
+ :: This section handles the initial installation of packages from requirements.txt
+ :: immediately after a new virtual environment is created.
+ echo Checking for requirements.txt for initial setup in %~dp0
+ if exist "%~dp0requirements.txt" (
+     echo Found requirements.txt, attempting to install for initial setup...
+     call "%VENV_DIR%\Scripts\activate.bat"
+     echo Installing packages from requirements.txt ^(initial setup^)...
+     "%VENV_DIR%\Scripts\python.exe" -m pip install -r "%~dp0requirements.txt"
+     REM Use "if errorlevel 1" here: %ERRORLEVEL% would be expanded when the
+     REM whole parenthesized block is parsed, not after pip runs.
+     if errorlevel 1 (
+         echo Failed to install requirements during initial setup. Please check the output above.
+         pause
+         goto :show_stdout_stderr_custom_pip_initial
+     )
+     echo Initial requirements installed successfully.
+     call "%VENV_DIR%\Scripts\deactivate.bat"
+ ) else (
+     echo No requirements.txt found for initial setup, skipping package installation.
+ )
+ goto :activate_venv_and_maybe_update
+
+
+ :activate_venv_and_maybe_update
+ :: This label is reached if the venv exists or was just created.
+ :: Set PYTHON to point to the venv's Python interpreter.
+ set PYTHON="%VENV_DIR%\Scripts\Python.exe"
+ echo Activating venv: %PYTHON%
+
+ :: Always update requirements if ALWAYS_UPDATE_REQS is 1
+ :: This section allows for updating packages from requirements.txt on every launch
+ :: if the ALWAYS_UPDATE_REQS variable is set to 1.
+ if defined ALWAYS_UPDATE_REQS (
+     if "%ALWAYS_UPDATE_REQS%"=="1" (
+         echo ALWAYS_UPDATE_REQS is enabled.
+         if exist "%~dp0requirements.txt" (
+             echo Attempting to update packages from requirements.txt...
+             REM No need to call activate.bat here again, PYTHON is already set to the venv's python
+             %PYTHON% -m pip install -r "%~dp0requirements.txt"
+             if errorlevel 1 (
+                 echo Failed to update requirements. Please check the output above.
+                 pause
+                 goto :endofscript
+             )
+             echo Requirements updated successfully.
+         ) else (
+             echo ALWAYS_UPDATE_REQS is enabled, but no requirements.txt found. Skipping update.
+         )
+     ) else (
+         echo ALWAYS_UPDATE_REQS is not enabled or not set to 1. Skipping routine update.
+     )
+ )
+
+ goto :launch
+
+ :skip_venv_entirely
+ :: This label is reached if venv usage is explicitly skipped.
+ echo Skipping venv.
+ goto :launch
+
+ :launch
+ :: Launch the main application
+ echo Launching Web UI with arguments: %COMMANDLINE_ARGS% %*
+ %PYTHON% %APPLICATION_NAME% %COMMANDLINE_ARGS% %*
+ echo Launch finished.
+ pause
+ exit /b
+
+ :show_stdout_stderr_custom_pip_initial
+ :: Custom error handler for failures during the initial pip install process.
+ echo.
+ echo exit code ^(pip initial install^): %errorlevel%
+ echo Errors during initial pip install. See output above.
+ echo.
+ echo Launch unsuccessful. Exiting.
+ pause
+ exit /b
+
+
+ :show_stdout_stderr
+ :: General error handler: displays stdout and stderr from the tmp directory.
+ echo.
+ echo exit code: %errorlevel%
+
+ for /f %%i in ("tmp\stdout.txt") do set size=%%~zi
+ if %size% equ 0 goto :show_stderr
+ echo.
+ echo stdout:
+ type tmp\stdout.txt
+
+ :show_stderr
+ for /f %%i in ("tmp\stderr.txt") do set size=%%~zi
+ if %size% equ 0 goto :endofscript
+ echo.
+ echo stderr:
+ type tmp\stderr.txt
+
+ :endofscript
+ echo.
+ echo Launch unsuccessful. Exiting.
+ pause
+ exit /b