---
license: mit
library_name: transformers
tags:
- Aerial Image Segmentation
- Road Detection
- Semantic Segmentation
- U-Net-50
- Computer Vision
- Remote Sensing
- Urban Planning
- Geographic Information Systems (GIS)
- Deep Learning
datasets:
- balraj98/massachusetts-roads-dataset
---

# Model Card for spectrewolf8/aerial-image-road-segmentation-with-U-NET-xp

This model card provides an overview of a computer vision model designed for aerial image road segmentation using the U-Net-50 architecture. The model is intended to accurately identify and segment road networks from aerial imagery, which is crucial for applications in mapping and autonomous driving.

## Model Details

### Model Description

- **Developed by:** [spectrewolf8](https://github.com/Spectrewolf8)
- **Model type:** Computer-Vision/Semantic-segmentation
- **License:** MIT

### Model Sources

- **Repository:** https://github.com/Spectrewolf8/aerial-image-road-segmentation-xp

## Uses

### Direct Use

This model can be used to segment road networks from aerial images without additional fine-tuning. It is applicable in scenarios where detailed and accurate road mapping is required.

### Downstream Use

When fine-tuned on additional datasets, this model can be adapted to other semantic segmentation tasks, potentially enhancing applications in various remote sensing domains.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
# Import necessary libraries
import random

import numpy as np
import matplotlib.pyplot as plt
import tensorflow
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

seed = 24
batch_size = 8

# Build paired image/mask generators from their respective directories.
# The images and masks are yielded as NumPy arrays.
# Images can be resized by passing target_size=(150, 150) (or any size your network expects)
# to flow_from_directory; our images are already cropped to 256x256, so it can be omitted.
def image_and_mask_generator(image_dir, label_dir):
    img_data_gen_args = dict(rescale=1 / 255.)
    mask_data_gen_args = dict()

    image_data_generator = ImageDataGenerator(**img_data_gen_args)
    image_generator = image_data_generator.flow_from_directory(
        image_dir,
        seed=seed,
        batch_size=batch_size,
        classes=["."],
        class_mode=None,  # Important: without this, the generator assumes binary class mode and returns labels too
    )

    mask_data_generator = ImageDataGenerator(**mask_data_gen_args)
    mask_generator = mask_data_generator.flow_from_directory(
        label_dir,
        classes=["."],
        seed=seed,
        batch_size=batch_size,
        color_mode='grayscale',  # Read masks in grayscale
        class_mode=None,
    )

    # Print a few processed image paths as a sanity check
    print(image_generator.filenames[0:5])
    print(mask_generator.filenames[0:5])

    generator = zip(image_generator, mask_generator)
    return generator

# Intersection over Union (IoU) coefficient
def iou_coef(y_true, y_pred, smooth=1e-6):
    intersection = tensorflow.reduce_sum(y_true * y_pred)
    union = tensorflow.reduce_sum(y_true) + tensorflow.reduce_sum(y_pred) - intersection
    return (intersection + smooth) / (union + smooth)

# Dice coefficient
def dice_coef(y_true, y_pred, smooth=1e-6):
    intersection = tensorflow.reduce_sum(y_true * y_pred)
    total = tensorflow.reduce_sum(y_true) + tensorflow.reduce_sum(y_pred)
    return (2. * intersection + smooth) / (total + smooth)
# Dice loss
def soft_dice_loss(y_true, y_pred):
    return 1 - dice_coef(y_true, y_pred)

# Wrap the zipped image/mask generators in a plain Python generator
def create_generator(zipped):
    for (img, mask) in zipped:
        yield (img, mask)

model_path = "path"  # Path to the trained Keras model file
u_net_model = load_model(
    model_path,
    custom_objects={'soft_dice_loss': soft_dice_loss, 'dice_coef': dice_coef, 'iou_coef': iou_coef},
)

# output_test_image_dir / output_test_label_dir should point to the preprocessed test image and mask directories
test_generator = create_generator(image_and_mask_generator(output_test_image_dir, output_test_label_dir))

# Pull one batch of test images and their ground-truth masks
images, ground_truth_masks = next(test_generator)

# Make predictions
predictions = u_net_model.predict(images)

# Apply a threshold to the predicted probabilities
thresh_val = 0.8
prediction_threshold = (predictions > thresh_val).astype(np.uint8)

# Visualize results
num_samples = min(10, len(images))  # Use at most 10 samples or the total number of images available
f = plt.figure(figsize=(15, 25))
for i in range(num_samples):
    ix = random.randint(0, len(images) - 1)  # Ensure ix is within range

    f.add_subplot(num_samples, 4, i * 4 + 1)
    plt.imshow(images[ix])
    plt.title("Image")
    plt.axis('off')

    f.add_subplot(num_samples, 4, i * 4 + 2)
    plt.imshow(np.squeeze(ground_truth_masks[ix]))
    plt.title("Ground Truth")
    plt.axis('off')

    f.add_subplot(num_samples, 4, i * 4 + 3)
    plt.imshow(np.squeeze(predictions[ix]))
    plt.title("Prediction")
    plt.axis('off')

    f.add_subplot(num_samples, 4, i * 4 + 4)
    plt.imshow(np.squeeze(prediction_threshold[ix]))
    plt.title(f"Thresholded at {thresh_val}")
    plt.axis('off')

plt.show()
```

## Training Details

### Training Data

The model was trained on the Massachusetts Roads Dataset, which includes high-resolution aerial images with corresponding road segmentation masks. The images were preprocessed by cropping them into 256x256 patches and converting the masks to binary format.

### Training Procedure

#### Preprocessing

- Images were cropped into 256x256 patches to manage memory usage and improve training efficiency (a minimal cropping/binarization sketch is included after the Evaluation summary below).
- Masks were binarized to create clear road/non-road classifications.

#### Training Hyperparameters

- **Training regime:** FP32 precision
- **Epochs:** 2
- **Batch size:** 8
- **Learning rate:** 0.0001

An illustrative compile/fit sketch using these settings is included at the end of this card.

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated on a held-out set of aerial images and their corresponding ground-truth masks from the same dataset.

#### Metrics

- **Intersection over Union (IoU):** Measures the overlap between the predicted and actual road areas.
- **Dice coefficient:** Evaluates the similarity between the predicted and ground-truth masks.

### Results

The model achieved 71% accuracy in segmenting road networks from aerial images, with the evaluation metrics indicating good performance in distinguishing road features from non-road areas.

#### Summary

The U-Net-50 model effectively segments road networks, demonstrating its potential for practical applications in urban planning and autonomous systems.
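For reference, the sketch below illustrates the preprocessing described under Training Details: cropping large source tiles into 256x256 patches and binarizing the road masks. It is a minimal example rather than the repository's actual preprocessing script; the `crop_and_binarize` function name, the directory arguments, and the threshold of 128 are assumptions.

```python
# Illustrative preprocessing sketch (not the exact script from the repository):
# crop a large aerial tile and its mask into 256x256 patches and binarize the mask.
# Paths, file names, and the binarization threshold below are assumptions.
import os

import cv2
import numpy as np

PATCH_SIZE = 256

def crop_and_binarize(image_path, mask_path, out_image_dir, out_label_dir, threshold=128):
    os.makedirs(out_image_dir, exist_ok=True)
    os.makedirs(out_label_dir, exist_ok=True)

    image = cv2.imread(image_path, cv2.IMREAD_COLOR)
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)

    height, width = mask.shape
    base = os.path.splitext(os.path.basename(image_path))[0]

    for top in range(0, height - PATCH_SIZE + 1, PATCH_SIZE):
        for left in range(0, width - PATCH_SIZE + 1, PATCH_SIZE):
            img_patch = image[top:top + PATCH_SIZE, left:left + PATCH_SIZE]
            mask_patch = mask[top:top + PATCH_SIZE, left:left + PATCH_SIZE]

            # Binarize the mask: road pixels -> 255, background -> 0
            mask_patch = np.where(mask_patch >= threshold, 255, 0).astype(np.uint8)

            name = f"{base}_{top}_{left}.png"
            cv2.imwrite(os.path.join(out_image_dir, name), img_patch)
            cv2.imwrite(os.path.join(out_label_dir, name), mask_patch)
```

The resulting image and label directories can then be passed to the `image_and_mask_generator` helper shown in the quick-start snippet above.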
## Technical Specifications

### Model Architecture and Objective

- **Architecture:** U-Net-50
- **Objective:** Road segmentation in aerial images

### Compute Infrastructure

#### Software

- **Framework:** TensorFlow 2.x
- **Dependencies:** Keras, OpenCV, tifffile

## Citation

**BibTeX:**

    @misc{aerial-image-road-segmentation-with-U-NET-xp,
      author       = {spectrewolf8},
      title        = {Aerial Image Road Segmentation Using U-Net-50},
      year         = {2024},
      howpublished = {\url{https://github.com/Spectrewolf8/aerial-image-road-segmentation-xp}},
    }

## Demo

![image/png](https://cdn-uploads.huggingface.co/production/uploads/668d0a0916006f60d0451bd2/4heKUP2xhskHl99MTl8bf.png)
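For completeness, here is a minimal sketch of a training setup that matches the hyperparameters listed under Training Details (learning rate 0.0001, batch size 8, 2 epochs). The choice of the Adam optimizer, the use of `soft_dice_loss` as the training loss, the `output_train_*`/`output_val_*` directory names, and the `num_train_images`/`num_val_images` step counts are assumptions; the authoritative training script lives in the repository linked above.

```python
# Illustrative training configuration (assumed, not the exact script from the repository).
# u_net_model, soft_dice_loss, iou_coef, dice_coef, create_generator,
# image_and_mask_generator, and batch_size are defined as in the quick-start snippet.
from tensorflow.keras.optimizers import Adam

# Hypothetical directories holding the preprocessed 256x256 training/validation patches
train_generator = create_generator(image_and_mask_generator(output_train_image_dir, output_train_label_dir))
val_generator = create_generator(image_and_mask_generator(output_val_image_dir, output_val_label_dir))

u_net_model.compile(
    optimizer=Adam(learning_rate=1e-4),  # Learning rate: 0.0001
    loss=soft_dice_loss,                 # Assumed loss, matching the custom objects used when loading the model
    metrics=[iou_coef, dice_coef],
)

history = u_net_model.fit(
    train_generator,
    validation_data=val_generator,
    steps_per_epoch=num_train_images // batch_size,  # batch_size = 8; num_train_images is hypothetical
    validation_steps=num_val_images // batch_size,   # num_val_images is hypothetical
    epochs=2,                                        # Epochs: 2
)
```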