Underwater Target Recognition and Localization Model Library

Project Overview

This repository contains deep learning models for underwater target recognition and localization: the MCL/MEG series of networks designed specifically for underwater acoustic scenarios, and general recognition models adapted from the computer vision field. The models classify and localize underwater targets from their acoustic signatures and can be applied to marine monitoring, underwater security, and related fields.

Model Description

1. Specialized Network Series (Recognition + Localization)

Model Name	Description	Input Features	Function
MCL	Basic network without mixture-of-experts	GFCC/STFT	Recognition + Localization
MEG	MCL with added mixture-of-experts model	GFCC/STFT	Recognition + Localization
MEG_BLC	MEG variant with load balancing mechanism	GFCC/STFT	Recognition + Localization
MEG_MIX	MEG variant with multi-feature fusion input	Multiple feature fusion	Recognition + Localization
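
The table above says that MEG adds a mixture-of-experts model to MCL and that MEG_BLC adds a load balancing mechanism, but it does not describe the gating or the balancing term. The following is a minimal, generic sketch of both ideas in plain Python, not the architecture used in this repository. The function names, the dense softmax gate, and the squared coefficient-of-variation penalty are all assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moe_combine(gate_logits, expert_outputs):
    """Mix expert output vectors using softmax gate weights.

    Hypothetical helper: a dense (non-sparse) gate is assumed here.
    """
    weights = softmax(gate_logits)
    dim = len(expert_outputs[0])
    mixed = [
        sum(w * out[i] for w, out in zip(weights, expert_outputs))
        for i in range(dim)
    ]
    return mixed, weights


def load_balance_penalty(batch_weights):
    """Squared coefficient of variation of per-expert total gate weight.

    Assumed formulation: the penalty is 0 when every expert receives the
    same total weight over the batch and grows as usage becomes uneven.
    """
    n_experts = len(batch_weights[0])
    importance = [sum(w[e] for w in batch_weights) for e in range(n_experts)]
    mean = sum(importance) / n_experts
    var = sum((x - mean) ** 2 for x in importance) / n_experts
    return var / (mean ** 2)
```

In a setup like this, the penalty would typically be added to the training loss with a small weight so that the gate does not collapse onto a single expert. Whether MEG_BLC works this way should be checked against the project code.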

2. General CV Networks (Recognition Only)

Classic models migrated from the computer vision field, adapted for underwater acoustic signature recognition tasks:

DenseNet121
MobileNetV2
ResNet18
ResNet50
Swin-Transformer

Performance Metrics

Network	ACC (%)	MAE-R (km)	MAE-D (m)
MEG (STFT)	95.93	0.2011	20.61
MCL (STFT)	96.07	0.2565	27.68
MEG (GFCC)	95.75	0.1707	19.43
MCL (GFCC)	96.10	0.3384	35.42
DenseNet121	86.61	-	-
ResNet18	84.99	-	-
MobileNetV2	83.60	-	-
ResNet50	76.34	-	-
Swin-Transformer	63.08	-	-

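Note: ACC is recognition accuracy, MAE-R is the mean absolute error of the range estimate, and MAE-D is the mean absolute error of the depth estimate. A dash indicates that the metric does not apply, because the general CV networks perform recognition only.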
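
For readers who want to reproduce metrics of this kind on their own predictions, the sketch below computes accuracy as a percentage and mean absolute error using the standard definitions. The function names and the sample values are hypothetical, and the evaluation code in the project repository may differ in details such as averaging or unit conversion.

```python
def accuracy_percent(true_labels, pred_labels):
    """Share of correctly predicted class labels, in percent."""
    if not true_labels or len(true_labels) != len(pred_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, pred_labels))
    return 100.0 * correct / len(true_labels)


def mean_absolute_error(true_values, pred_values):
    """Average absolute difference, in the unit of the inputs (km or m)."""
    if not true_values or len(true_values) != len(pred_values):
        raise ValueError("value lists must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_values, pred_values))
    return total / len(true_values)


# Made-up example values, for illustration only.
acc = accuracy_percent([0, 1, 1, 2], [0, 1, 2, 2])
mae_range_km = mean_absolute_error([1.0, 2.0], [1.5, 1.0])
```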

Usage Instructions

1. Model Download

Model weight files can be downloaded from Hugging Face Hub or ModelScope. Complete project code is available through the following links:

Gitee:
GitHub:

2. Model Usage

Use the --resume command-line argument to specify the folder containing the weight files. By default, model.pth is loaded from that folder.

python train_mtl.py --features stft --task_type mtl --resume './models/meg(stft)'
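
The command above passes a folder to --resume, and the weight file model.pth is then looked up inside it. A minimal sketch of that lookup, assuming a standard argparse setup, is shown here. The parser below mirrors only the three flags that appear in the command; the defaults and the helper name checkpoint_path are hypothetical, and the real argument definitions live in train_mtl.py.

```python
import argparse
from pathlib import Path


def build_parser():
    """Hypothetical parser covering only the flags shown in the command."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--features", default="stft")   # assumed default
    parser.add_argument("--task_type", default="mtl")   # assumed default
    parser.add_argument("--resume", default=None)       # folder with weights
    return parser


def checkpoint_path(resume_dir, filename="model.pth"):
    """Join the --resume folder with the default weight file name."""
    return Path(resume_dir) / filename


args = build_parser().parse_args(
    ["--features", "stft", "--task_type", "mtl", "--resume", "./models/meg(stft)"]
)
weights_file = checkpoint_path(args.resume)
```

Note that the folder name contains parentheses, so it needs to be quoted on the command line, as the original command does.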

3. Input and Output

Input: Acoustic features (GFCC/STFT, etc.)
Output: Target category, range estimate, depth estimate

For detailed input/output formats and training/inference code, please refer to the project repository documentation.

Citation Information

The related research paper is under review and is expected to be published in MDPI's Remote Sensing journal in September 2025. If you use models from this project, please cite the following paper (the entry will be updated after publication):

@article{uwtrl2025,
  title={Multi-Task Mixture-of-Experts Model for Underwater Target Localization and Recognition},
  author={Peng Qian and Jingyi Wang and Yining Liu and Yingxuan Chen and Pengjiu Wang and Yanfa Deng and Peng Xiao and Zhenglin Li},
  journal={Remote Sensing},
  year={2025},
  publisher={MDPI}
}

Contact Information

For questions or collaboration inquiries, please contact: [[email protected]]

This project is for academic research use only. For commercial use, please contact the authors for authorization.
