---
license: mit
library_name: transformers
tags:
- Aerial Image Segmentation
- Road Detection
- Semantic Segmentation
- U-Net-50
- Computer Vision
- Remote Sensing
- Urban Planning
- Geographic Information Systems (GIS)
- Deep Learning
datasets:
- balraj98/massachusetts-roads-dataset
---

# Model Card for spectrewolf8/aerial-image-road-segmentation-with-U-NET-xp

This model card describes a computer vision model for road segmentation in aerial images, built on the U-Net-50 architecture. The model identifies and segments road networks in aerial imagery, a task central to applications in mapping and autonomous driving.

## Model Details

### Model Description

- **Developed by:**  [spectrewolf8](https://github.com/Spectrewolf8)
- **Model type:** Computer-Vision/Semantic-segmentation
- **License:** MIT

### Model Sources

- **Repository:** https://github.com/Spectrewolf8/aerial-image-road-segmentation-xp
  
## Uses

### Direct Use

This model can be used to segment road networks from aerial images without additional fine-tuning. It is applicable in scenarios where detailed and accurate road mapping is required.

### Downstream Use 

When fine-tuned on additional datasets, this model can be adapted for other types of semantic segmentation tasks, potentially enhancing applications in various remote sensing domains.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
# Import necessary modules
import random

import numpy as np
import matplotlib.pyplot as plt
import tensorflow  # needed by the metric functions below (tensorflow.reduce_sum)
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

seed=24
batch_size= 8

# Build image and mask generators from the dataset directories. Batches of images and masks are returned as NumPy arrays.

# To resize images for your network, pass target_size=(height, width) to flow_from_directory.
# Our images are already cropped to 256x256, which matches the default target_size, so the parameter can be omitted.

def image_and_mask_generator(image_dir, label_dir):
    img_data_gen_args = dict(rescale = 1/255.)
    mask_data_gen_args = dict()

    image_data_generator = ImageDataGenerator(**img_data_gen_args)
    image_generator = image_data_generator.flow_from_directory(image_dir, 
                                                               seed=seed, 
                                                               batch_size=batch_size,
                                                               classes = ["."],
                                                               class_mode=None  # Required: yields only image batches instead of (images, labels) tuples
                                                               )  

    mask_data_generator = ImageDataGenerator(**mask_data_gen_args)
    mask_generator = mask_data_generator.flow_from_directory(label_dir, 
                                                             classes = ["."],
                                                             seed=seed, 
                                                             batch_size=batch_size,
                                                             color_mode = 'grayscale', #Read masks in grayscale
                                                             class_mode=None
                                                             )
    # Print the first few file paths as a sanity check that images and masks line up
    print(image_generator.filenames[0:5])
    print(mask_generator.filenames[0:5])
    
    generator = zip(image_generator, mask_generator)
    return generator

# Method to calculate Intersection over Union Accuracy Coefficient
def iou_coef(y_true, y_pred, smooth=1e-6):
    intersection = tensorflow.reduce_sum(y_true * y_pred)
    union = tensorflow.reduce_sum(y_true) + tensorflow.reduce_sum(y_pred) - intersection
    
    return (intersection + smooth) / (union + smooth)

# Method to calculate Dice Accuracy Coefficient
def dice_coef(y_true, y_pred, smooth=1e-6):
    intersection = tensorflow.reduce_sum(y_true * y_pred)
    total = tensorflow.reduce_sum(y_true) + tensorflow.reduce_sum(y_pred)
    
    return (2. * intersection + smooth) / (total + smooth)

# Method to calculate Dice Loss
def soft_dice_loss(y_true, y_pred):
    return 1-dice_coef(y_true, y_pred)

# Method to create generator
def create_generator(zipped):
    for (img, mask) in zipped:
        yield (img, mask)

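# Placeholder: replace with the path to your saved model file
model_path = "path"
u_net_model = load_model(model_path, custom_objects={'soft_dice_loss': soft_dice_loss, 'dice_coef': dice_coef, "iou_coef": iou_coef})

# Placeholders (not defined in the original snippet): point these at your test image and mask directories
output_test_image_dir = "path/to/test/images"
output_test_label_dir = "path/to/test/labels"
test_generator = create_generator(image_and_mask_generator(output_test_image_dir, output_test_label_dir))

# Fetch one batch of images and ground-truth masks
images, ground_truth_masks = next(test_generator)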

# Make predictions
predictions = u_net_model.predict(images)

# Apply threshold to predictions
thresh_val = 0.8
prediction_threshold = (predictions > thresh_val).astype(np.uint8)

# Visualize results
num_samples = min(10, len(images))  # Use at most 10 samples or the total number of images available
f = plt.figure(figsize=(15, 25))
for i in range(num_samples):
    ix = random.randint(0, len(images) - 1)  # Ensure ix is within range

    f.add_subplot(num_samples, 4, i * 4 + 1)
    plt.imshow(images[ix])
    plt.title("Image")
    plt.axis('off')

    f.add_subplot(num_samples, 4, i * 4 + 2)
    plt.imshow(np.squeeze(ground_truth_masks[ix]))
    plt.title("Ground Truth")
    plt.axis('off')

    f.add_subplot(num_samples, 4, i * 4 + 3)
    plt.imshow(np.squeeze(predictions[ix]))
    plt.title("Prediction")
    plt.axis('off')

    f.add_subplot(num_samples, 4, i * 4 + 4)
    plt.imshow(np.squeeze(prediction_threshold[ix]))
    plt.title(f"Thresholded at {thresh_val}")
    plt.axis('off')

plt.show()

```
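
The mask generator above applies no rescaling, so `ground_truth_masks` holds raw grayscale values while the predictions lie between 0 and 1. The following is a minimal sketch, not part of the original repository, of putting both on the same 0/1 scale before comparing them. The helper names `binarize_mask` and `binarize_prediction`, the mask cutoff of 127, and the assumption that bright pixels mark roads are all illustrative.

```python
import numpy as np

def binarize_mask(mask, cutoff=127):
    """Map a grayscale mask (0-255) to 0/1. The cutoff value is an assumption."""
    return (np.asarray(mask) > cutoff).astype(np.uint8)

def binarize_prediction(pred, thresh=0.8):
    """Map sigmoid-style outputs in [0, 1] to 0/1 using the same threshold as above."""
    return (np.asarray(pred) > thresh).astype(np.uint8)

# Tiny synthetic batch standing in for generator output, shape (batch, H, W, 1)
ground_truth = np.array([[0, 255], [255, 0]], dtype=np.float32).reshape(1, 2, 2, 1)
prediction = np.array([[0.1, 0.95], [0.6, 0.2]], dtype=np.float32).reshape(1, 2, 2, 1)

gt_bin = binarize_mask(ground_truth)
pred_bin = binarize_prediction(prediction)

# Fraction of pixels where the binary prediction matches the binary mask
pixel_agreement = float((gt_bin == pred_bin).mean())
print(pixel_agreement)
```

With real data, `ground_truth` and `prediction` would be replaced by `ground_truth_masks` and `predictions` from the snippet above.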


## Training Details

### Training Data

The model was trained on the Massachusetts Roads Dataset, which includes high-resolution aerial images with corresponding road segmentation masks. The images were preprocessed by cropping into 256x256 patches and converting masks to binary format.

### Training Procedure

#### Preprocessing

- Images were cropped into 256x256 patches to manage memory usage and improve training efficiency.
- Masks were binarized to create clear road/non-road classifications.
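
The two preprocessing steps above can be sketched as follows. This is an illustrative NumPy version, not the code used to build the training set: the function names, the non-overlapping tiling, the dropping of edge remainders, and the binarization cutoff are all assumptions.

```python
import numpy as np

PATCH = 256

def crop_to_patches(image, patch=PATCH):
    """Split an H x W (x C) array into non-overlapping patch x patch tiles.

    Any remainder at the right and bottom edges is dropped (an assumption).
    The image must be at least patch x patch in size.
    """
    h, w = image.shape[:2]
    tiles = []
    for top in range(0, h - patch + 1, patch):
        for left in range(0, w - patch + 1, patch):
            tiles.append(image[top:top + patch, left:left + patch])
    return np.stack(tiles)

def binarize_mask(mask, cutoff=127):
    """Turn a grayscale mask into 0 (non-road) / 1 (road). The cutoff is an assumption."""
    return (np.asarray(mask) > cutoff).astype(np.uint8)

# Synthetic stand-ins for one aerial image and its mask (1500 x 1500 is only an example size)
image = np.zeros((1500, 1500, 3), dtype=np.uint8)
mask = np.zeros((1500, 1500), dtype=np.uint8)
mask[:, 700:720] = 255  # a vertical "road" stripe

image_patches = crop_to_patches(image)
mask_patches = binarize_mask(crop_to_patches(mask))
print(image_patches.shape, mask_patches.shape)
```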

#### Training Hyperparameters

- **Training regime:** FP32 precision
- **Epochs:** 2
- **Batch Size:** 8
- **Learning Rate:** 0.0001

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated using a separate set of aerial images and their corresponding ground truth masks from the dataset.

#### Metrics

- **Intersection over Union (IoU):** Measures the overlap between predicted and actual road areas.
- **Dice Coefficient:** Evaluates the similarity between predicted and ground truth masks.
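
As a small worked example, the NumPy functions below mirror the `iou_coef` and `dice_coef` formulas from the getting-started code and apply them to a toy mask. The `_np` function names and the toy arrays are illustrative only.

```python
import numpy as np

def iou_coef_np(y_true, y_pred, smooth=1e-6):
    """Intersection over Union: |A ∩ B| / |A ∪ B|, smoothed to avoid division by zero."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    intersection = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred) - intersection
    return (intersection + smooth) / (union + smooth)

def dice_coef_np(y_true, y_pred, smooth=1e-6):
    """Dice coefficient: 2 |A ∩ B| / (|A| + |B|), smoothed to avoid division by zero."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    intersection = np.sum(y_true * y_pred)
    total = np.sum(y_true) + np.sum(y_pred)
    return (2.0 * intersection + smooth) / (total + smooth)

# Toy example: two road pixels in each mask, one of them shared
y_true = np.array([1, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0])

iou = iou_coef_np(y_true, y_pred)    # intersection 1, union 3, so about 1/3
dice = dice_coef_np(y_true, y_pred)  # 2 * 1 / (2 + 2), so about 0.5
print(round(iou, 3), round(dice, 3))
```

For the same pair of masks, Dice is never lower than IoU. Ignoring the smoothing term, the two are related by Dice = 2 · IoU / (1 + IoU).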

### Results

The model achieved 71% accuracy in segmenting road networks from aerial images, with evaluation metrics indicating good performance in distinguishing road features from non-road areas.

#### Summary

The U-Net-50 model effectively segments road networks, demonstrating its potential for practical applications in urban planning and autonomous systems.

## Technical Specifications

### Model Architecture and Objective

- **Architecture:** U-Net-50
- **Objective:** Road segmentation in aerial images

### Compute Infrastructure

#### Software

- **Framework:** TensorFlow 2.x
- **Dependencies:** Keras, OpenCV, tifffile

**BibTeX:**

@misc{aerial-image-road-segmentation-with-U-NET-xp,
  author = {spectrewolf8},
  title = {Aerial Image Road Segmentation Using U-Net-50},
  year = {2024},
  howpublished = {\url{https://github.com/Spectrewolf8/aerial-image-road-segmentation-xp}},
}

## Demo

![image/png](https://cdn-uploads.huggingface.co/production/uploads/668d0a0916006f60d0451bd2/4heKUP2xhskHl99MTl8bf.png)