---
datasets:
- Borcherding/Hed2CoralReef_Annotations
tags:
- hed-to-reef
- image-to-image
- cyclegan
- hed-to-anything
base_model:
- keras-io/CycleGAN
---

# CycleGAN_Hed2CoralReef Model

This model translates HED edge maps into coral reef-style images and, in the reverse direction, estimates HED maps from coral reef-style images, using the CycleGAN architecture.

<div style="display: flex; flex-wrap: wrap; justify-content: center;">
  <div style="display: flex; width: 100%; justify-content: center; margin-bottom: 10px;">
    <img src="hed2image_fixed_testA/hed2image/test_latest/images/custom_real.png" alt="HED map" title="HED Map (Input)" width="45%">
    <img src="hed2image_fixed_testA/hed2image/test_latest/images/custom_fake.png" alt="coral reef-style image" title="Coral Reef-Style Image (Output)" width="45%">
  </div>
  <div style="display: flex; width: 100%; justify-content: center;">
    <img src="hed2image_fixed_testB/hed2image/test_latest/images/custom_real.png" alt="coral reef-style image" title="Coral Reef-Style Image (Input)" width="45%">
    <img src="hed2image_fixed_testB/hed2image/test_latest/images/custom_fake.png" alt="HED map" title="HED Map (Output)" width="45%">
  </div>
</div>

## Model Description

- This model was trained on coral reef images generated with SDXL and their associated HED maps, extracted with pytorch-hed (see the dataset-loading sketch after this list):
  [Hed2CoralReef_Annotations](https://huggingface.co/datasets/Borcherding/Hed2CoralReef_Annotations)
- It uses the CycleGAN architecture
- Training notebooks and dataset generators can be found in the src folder and in the GitHub repo [Leoleojames1/CycleGANControlNet2Anything](https://github.com/Leoleojames1/CycleGANControlNet2Anything)
- It supports bidirectional transformation:
  - HED map → Coral reef-style imagery
  - Coral reef-style imagery → HED map
- The model uses a ResNet-based generator with residual blocks

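To inspect the training pairs yourself, the dataset can typically be pulled with the `datasets` library. This is a minimal sketch only; the split name and column layout are assumptions, not guaranteed by this card:

```python
from datasets import load_dataset

# Minimal sketch, assuming the annotations repo loads with its default configuration;
# adjust the split and column names to whatever the dataset actually exposes.
ds = load_dataset("Borcherding/Hed2CoralReef_Annotations", split="train")
print(ds)        # number of examples and available columns
example = ds[0]  # e.g. a coral reef image together with its HED map
```
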
## Installation

```bash
# Clone the repository
git clone https://huggingface.co/Borcherding/CycleGAN_Depth2RobotsV2_Blend
cd CycleGAN_Depth2RobotsV2_Blend

# Install dependencies
pip install torch torchvision gradio pyvirtualcam
```

## Usage Options

### Option 1: Simple Test Interface

Run the simple test interface to quickly try out the model:

```bash
python cycleGANtest.py
```

This launches a Gradio interface (sketched below) where you can:
- Upload an image
- Select the conversion direction (Depth to Image or Image to Depth)
- Transform the image with a single click

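As a rough idea of how such an interface can be wired up, here is a minimal, hypothetical Gradio sketch built on the `transform_image` helper from the programmatic example further below. It is not the actual contents of `cycleGANtest.py`, and it assumes the `depth2image` direction corresponds to HED map → image for this model:

```python
import gradio as gr

# Minimal sketch, not the real cycleGANtest.py; `transform_image` is defined in the
# "Using the Model Programmatically" section below.
def run(image_path, direction_label):
    direction = "depth2image" if direction_label == "HED to Image" else "image2depth"
    return transform_image(image_path, direction=direction)

demo = gr.Interface(
    fn=run,
    inputs=[
        gr.Image(type="filepath", label="Input image"),
        gr.Radio(["HED to Image", "Image to HED"], value="HED to Image", label="Direction"),
    ],
    outputs=gr.Image(type="numpy", label="Transformed image"),
    title="CycleGAN Hed2CoralReef demo",
)

if __name__ == "__main__":
    demo.launch()
```
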
### Option 2: Webcam Integration with Depth Estimation

For a more advanced setup that includes real-time webcam processing with Depth Anything V2:

```bash
# Set the path to Depth Anything V2
export DEPTH_ANYTHING_V2_PATH=/path/to/depth-anything-v2

# Run the integrated application
python discordDepth2AnythingGAN.py
```

This launches a Gradio interface that allows you to:
- Capture webcam input
- Generate depth maps using Depth Anything V2
- Apply a winter-themed colormap to the depth maps
- Apply the CycleGAN transformation in either direction
- Output to a virtual camera for use in video conferencing or streaming (see the sketch below)

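For the virtual-camera output step specifically, pyvirtualcam is used roughly like the following minimal sketch; the 640x480 resolution and the placeholder frame are assumptions, and the real application sends the transformed webcam frames instead:

```python
import numpy as np
import pyvirtualcam

# Minimal sketch of the virtual-camera output stage; resolution and fps are assumptions.
with pyvirtualcam.Camera(width=640, height=480, fps=30) as cam:
    print(f"Virtual camera started: {cam.device}")
    while True:
        frame = np.zeros((480, 640, 3), dtype=np.uint8)  # placeholder; normally the CycleGAN output frame
        cam.send(frame)                # expects an RGB uint8 array at the camera's resolution
        cam.sleep_until_next_frame()   # paces the loop to the requested fps
```
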
## Using the Model Programmatically

```python
import torch
import torch.nn as nn
import numpy as np
import torchvision.transforms as transforms
from PIL import Image
from huggingface_hub import hf_hub_download

# Define the Generator architecture (must match the architecture used in training)
class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super(ResidualBlock, self).__init__()
        self.conv_block = nn.Sequential(
            nn.ReflectionPad2d(1),
            nn.Conv2d(channels, channels, 3),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.ReflectionPad2d(1),
            nn.Conv2d(channels, channels, 3),
            nn.InstanceNorm2d(channels)
        )

    def forward(self, x):
        return x + self.conv_block(x)

class Generator(nn.Module):
    def __init__(self, input_channels=3, output_channels=3, n_residual_blocks=9):
        super(Generator, self).__init__()

        # Initial convolution
        model = [
            nn.ReflectionPad2d(3),
            nn.Conv2d(input_channels, 64, 7),
            nn.InstanceNorm2d(64),
            nn.ReLU(inplace=True)
        ]

        # Downsampling
        in_features = 64
        out_features = in_features * 2
        for _ in range(2):
            model += [
                nn.Conv2d(in_features, out_features, 3, stride=2, padding=1),
                nn.InstanceNorm2d(out_features),
                nn.ReLU(inplace=True)
            ]
            in_features = out_features
            out_features = in_features * 2

        # Residual blocks
        for _ in range(n_residual_blocks):
            model += [ResidualBlock(in_features)]

        # Upsampling
        out_features = in_features // 2
        for _ in range(2):
            model += [
                nn.ConvTranspose2d(in_features, out_features, 3, stride=2, padding=1, output_padding=1),
                nn.InstanceNorm2d(out_features),
                nn.ReLU(inplace=True)
            ]
            in_features = out_features
            out_features = in_features // 2

        # Output layer
        model += [
            nn.ReflectionPad2d(3),
            nn.Conv2d(64, output_channels, 7),
            nn.Tanh()
        ]

        self.model = nn.Sequential(*model)

    def forward(self, x):
        return self.model(x)

# Download the model
def download_model(direction="depth2image"):
    if direction == "depth2image":
        filename = "latest_net_G_A.pth"
    else:  # "image2depth"
        filename = "latest_net_G_B.pth"

    model_path = hf_hub_download(
        repo_id="Borcherding/CycleGAN_Depth2RobotsV2_Blend",
        filename=filename
    )
    return model_path

# Image preprocessing
def preprocess_image(image):
    """
    Preprocess image for model input

    Args:
        image: PIL Image or numpy array

    Returns:
        torch.Tensor: Normalized tensor ready for model input
    """
    if isinstance(image, np.ndarray):
        image = Image.fromarray(image.astype('uint8'), 'RGB')

    transform = transforms.Compose([
        transforms.Resize(256),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
    ])

    return transform(image).unsqueeze(0)

# Image postprocessing
def postprocess_image(tensor):
    """
    Convert model output tensor to numpy image

    Args:
        tensor: Model output tensor

    Returns:
        numpy.ndarray: RGB image array (0-255)
    """
    tensor = tensor.squeeze(0).cpu()
    tensor = (tensor + 1) / 2
    tensor = tensor.clamp(0, 1)
    tensor = tensor.permute(1, 2, 0).numpy()
    return (tensor * 255).astype(np.uint8)

# Example usage
def transform_image(input_image_path, direction="depth2image"):
    """
    Transform an image using the CycleGAN model

    Args:
        input_image_path: Path to input image
        direction: "depth2image" or "image2depth"

    Returns:
        numpy.ndarray: Transformed image
    """
    # Load model
    model_path = download_model(direction)
    model = Generator()
    model.load_state_dict(torch.load(model_path, map_location='cpu'), strict=False)
    model.eval()

    # Load and preprocess image
    input_image = Image.open(input_image_path).convert('RGB')
    input_tensor = preprocess_image(input_image)

    # Generate output
    with torch.no_grad():
        output_tensor = model(input_tensor)

    # Postprocess output
    output_image = postprocess_image(output_tensor)

    return output_image
```

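For example, to turn a single HED map into a coral reef-style image and save it. The file names are placeholders, and the `depth2image` direction is assumed to correspond to HED map → image for this model:

```python
from PIL import Image

# Placeholder file names; `transform_image` is the helper defined in the snippet above.
output = transform_image("my_hed_map.png", direction="depth2image")
Image.fromarray(output).save("coral_reef_style.png")
```
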
## Model Checkpoints

The model checkpoints are available on Hugging Face:
- Repository: [Borcherding/CycleGAN_Depth2RobotsV2_Blend](https://huggingface.co/Borcherding/CycleGAN_Depth2RobotsV2_Blend)
- Files:
  - `latest_net_G_A.pth` - Generator for the HED map → image transformation
  - `latest_net_G_B.pth` - Generator for the image → HED map transformation

## Integration with Depth Anything V2

The integrated application (`discordDepth2AnythingGAN.py`) also leverages [Depth Anything V2](https://github.com/depth-anything/Depth-Anything-V2) for real-time depth estimation, providing a complete pipeline (sketched below):

1. Capture webcam input
2. Generate depth maps with Depth Anything V2
3. Apply the CycleGAN transformation
4. Output to a virtual camera

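Put together, one iteration of that pipeline looks roughly like the sketch below. The depth step is a crude grayscale-plus-colormap stand-in (Depth Anything V2's actual API is not shown in this card), and `Generator`, `download_model`, `preprocess_image`, and `postprocess_image` are the ones from the programmatic example above:

```python
import cv2
import numpy as np
import torch

# Crude stand-in for Depth Anything V2, for illustration only: grayscale + winter colormap.
def fake_depth(rgb_frame: np.ndarray) -> np.ndarray:
    gray = cv2.cvtColor(rgb_frame, cv2.COLOR_RGB2GRAY)
    return cv2.applyColorMap(gray, cv2.COLORMAP_WINTER)[:, :, ::-1]  # BGR -> RGB

# Load the generator once (see the programmatic example above).
model = Generator()
model.load_state_dict(torch.load(download_model("depth2image"), map_location="cpu"), strict=False)
model.eval()

cap = cv2.VideoCapture(0)                        # 1. capture webcam input
ok, bgr = cap.read()
if ok:
    rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
    depth = fake_depth(rgb)                      # 2. depth map (placeholder for Depth Anything V2)
    with torch.no_grad():
        styled = postprocess_image(model(preprocess_image(depth)))  # 3. CycleGAN transformation
    # 4. `styled` can now be sent to a virtual camera, e.g. with pyvirtualcam as sketched earlier
cap.release()
```
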
## Requirements

- Python 3.7+
- PyTorch 1.7+
- torchvision
- gradio
- pyvirtualcam (for webcam integration)
- OpenCV (cv2)
- Depth Anything V2 (for the integrated application)

## License

[Insert your license information here]

## Acknowledgments

- This model uses the CycleGAN architecture from the paper [Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks](https://arxiv.org/abs/1703.10593) by Zhu et al.
- The implementation is based on [junyanz/pytorch-CycleGAN-and-pix2pix](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix)
- The integrated application leverages Depth Anything V2 for depth estimation