Underwater Target Recognition and Localization Model Library
Project Overview
This repository contains deep learning models for underwater target recognition and localization: the MCL/MEG series, designed specifically for underwater acoustic scenarios, and general-purpose recognition models adapted from computer vision. The models classify and localize underwater targets from their acoustic signatures, with applications in marine monitoring, underwater security, and related fields.
Model Description
1. Specialized Network Series (Recognition + Localization)
Model Name | Description | Input Features | Function |
---|---|---|---|
MCL | Base network without mixture-of-experts (MoE) | GFCC/STFT | Recognition + Localization |
MEG | MCL with an added mixture-of-experts module | GFCC/STFT | Recognition + Localization |
MEG_BLC | MEG variant with load balancing mechanism | GFCC/STFT | Recognition + Localization |
MEG_MIX | MEG variant with multi-feature fusion input | Multiple feature fusion | Recognition + Localization |
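To make the mixture-of-experts and load-balancing terms in the table concrete, the following is a minimal sketch in plain Python. It assumes a softmax gate over expert outputs and an importance-based balance penalty (squared coefficient of variation of per-expert gate mass). The function names and the choice of penalty are illustrative assumptions; the actual MEG and MEG_BLC implementations in the repository may differ.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of gate logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_output(gate_logits: Sequence[float],
                   expert_outputs: Sequence[Sequence[float]]) -> List[float]:
    """Combine expert outputs as a weighted sum using softmax gate weights."""
    weights = softmax(gate_logits)
    dim = len(expert_outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, expert_outputs))
            for i in range(dim)]


def load_balance_penalty(batch_gate_logits: Sequence[Sequence[float]]) -> float:
    """Squared coefficient of variation of per-expert gate mass over a batch.

    Assumed form of the balance term: it is 0 when every expert receives the
    same total gate weight and grows as the gate favours a few experts.
    """
    n_experts = len(batch_gate_logits[0])
    importance = [0.0] * n_experts
    for logits in batch_gate_logits:
        for i, w in enumerate(softmax(logits)):
            importance[i] += w
    mean = sum(importance) / n_experts
    variance = sum((v - mean) ** 2 for v in importance) / n_experts
    return variance / (mean ** 2)
```

In a setup like this, the penalty would typically be added to the task loss with a small weight so that no single expert dominates the gate.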
2. General CV Networks (Recognition Only)
Classic computer vision models adapted for underwater acoustic signature recognition:
- DenseNet121
- MobileNetV2
- ResNet18
- ResNet50
- Swin-Transformer
Performance Metrics
Network | ACC (%) | MAE-R (km) | MAE-D (m) |
---|---|---|---|
MEG (STFT) | 95.93 | 0.2011 | 20.61 |
MCL (STFT) | 96.07 | 0.2565 | 27.68 |
MEG (GFCC) | 95.75 | 0.1707 | 19.43 |
MCL (GFCC) | 96.10 | 0.3384 | 35.42 |
DenseNet121 | 86.61 | - | - |
ResNet18 | 84.99 | - | - |
MobileNetV2 | 83.60 | - | - |
ResNet50 | 76.34 | - | - |
Swin-Transformer | 63.08 | - | - |
Note: ACC is the recognition accuracy; MAE-R and MAE-D are the mean absolute errors of range and depth localization, respectively. A dash indicates that the model performs recognition only.
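As a reference for how these metrics are defined, here is a small sketch in plain Python. The function names and the sample values are illustrative only and are not taken from the project's evaluation code or results.

```python
from typing import Sequence


def accuracy_percent(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Recognition accuracy: share of correctly classified samples, in percent."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error, in the same unit as the inputs (km for range, m for depth)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels and ranges, purely to show the calls.
print(accuracy_percent([0, 1, 2, 1], [0, 1, 1, 1]))    # 75.0
print(mean_absolute_error([1.0, 2.0], [1.5, 1.0]))     # 0.75
```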
Usage Instructions
1. Model Download
Model weight files can be downloaded from Hugging Face Hub or ModelScope. Complete project code is available through the following links:
- Gitee:
- GitHub:
2. Model Usage
Use the --resume command-line argument to specify the folder containing the weight files; the file model.pth in that folder is loaded by default.
python train_mtl.py --features stft --task_type mtl --resume './models/meg(stft)'
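Before launching the command above, it can help to confirm that the folder passed to --resume actually contains model.pth. The helper below is a hypothetical pre-flight check written for this README, not the loader used by train_mtl.py; the name resolve_checkpoint is an assumption.

```python
from pathlib import Path


def resolve_checkpoint(resume: str, filename: str = "model.pth") -> Path:
    """Return the path of the weight file inside the folder given to --resume.

    Hypothetical helper: it only checks that the folder and file exist and
    does not load any weights.
    """
    folder = Path(resume)
    if not folder.is_dir():
        raise FileNotFoundError(f"--resume folder not found: {folder}")
    checkpoint = folder / filename
    if not checkpoint.is_file():
        raise FileNotFoundError(f"expected weight file missing: {checkpoint}")
    return checkpoint
```

For example, resolve_checkpoint('./models/meg(stft)') returns the path to model.pth in that folder, or raises FileNotFoundError if the weights have not been downloaded yet.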
3. Input and Output
- Input: Acoustic features (GFCC/STFT, etc.)
- Output: Target category, range estimation, depth estimation
For detailed input/output formats and training/inference code, please refer to the project repository documentation.
Citation Information
The related research paper is under review and is expected to be published in MDPI's Remote Sensing journal in September 2025. If you use models from this project, please cite the following paper (to be updated after publication):
@article{uwtrl2025,
title={Multi-Task Mixture-of-Experts Model for Underwater Target Localization and Recognition},
author={Peng Qian and Jingyi Wang and Yining Liu and Yingxuan Chen and Pengjiu Wang and Yanfa Deng and Peng Xiao and Zhenglin Li},
journal={Remote Sensing},
year={2025},
publisher={MDPI}
}
Contact Information
For questions or collaboration inquiries, please contact: [[email protected]]
This project is for academic research use only. For commercial use, please contact the authors for authorization.