Underwater target detection with an attention mechanism and improved scale

Wei, Xiangyu; Yu, Long; Tian, Shengwei; Feng, Pengcheng; Ning, Xin

doi:10.1007/s11042-021-11230-2

Published: 25 August 2021

Volume 80, pages 33747–33761, (2021)

Multimedia Tools and Applications

Xiangyu Wei¹,
Long Yu ORCID: orcid.org/0000-0001-9041-0801^2,3,
Shengwei Tian^4,5,
Pengcheng Feng¹ &
…
Xin Ning⁶

1007 Accesses
30 Citations
1 Altmetric

A Correction to this article was published on 15 October 2022

This article has been updated

Abstract

Poor lighting and complex environments make target detection in underwater images difficult. These images are usually blurry because tiny inorganic and organic particles in the water strongly affect the propagation of light. To address this problem, we add squeeze-and-excitation modules after the deep convolution layers of the YOLOv3 model to learn the relationships between channels and enhance the semantic information of deep features. In addition, many small targets lose too much information after five downsampling operations, which hinders their detection. By expanding the detection scale, we combine the deep semantic information with the location information of a shallower layer to improve the detection of small targets. The experimental results show that the YOLOv3-brackish model greatly improves the detection of small fish, crabs, shrimp and starfish, with minor improvements for big fish and jellyfish. The mean average precision increased by 4.43%.
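
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The first part follows the general squeeze-and-excitation recipe (global average pooling per channel, a small two-layer bottleneck, then channel-wise rescaling); biases are omitted and the weights `w1` and `w2` are hypothetical placeholders. The second part shows why five stride-2 downsampling steps hurt small targets: it assumes a 416-pixel input and the usual YOLOv3 strides of 32, 16 and 8, and it assumes that the added detection scale has stride 4, which the abstract does not state.

```python
import math


def squeeze(fmap):
    """Squeeze step: global average pooling, one scalar per channel.

    fmap is a list of C channels, each an H x W nested list of numbers.
    """
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def matvec(weights, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def excite(z, w1, w2):
    """Excitation step: FC -> ReLU -> FC -> sigmoid (biases omitted here).

    w1 maps C channels to a reduced size, w2 maps back to C channels.
    Both are hypothetical weights for illustration only.
    """
    hidden = [max(0.0, h) for h in matvec(w1, z)]
    return [1.0 / (1.0 + math.exp(-g)) for g in matvec(w2, hidden)]


def se_block(fmap, w1, w2):
    """Rescale every channel of fmap by its learned gate in (0, 1)."""
    gates = excite(squeeze(fmap), w1, w2)
    return [[[x * g for x in row] for row in ch] for ch, g in zip(fmap, gates)]


def grid_sizes(input_size, strides):
    """Side length of the detection grid at each total stride."""
    return {s: input_size // s for s in strides}


if __name__ == "__main__":
    fmap = [[[1, 2], [3, 4]], [[0, 0], [0, 0]]]  # 2 channels, 2 x 2 each
    w1 = [[1.0, 1.0]]        # 2 channels -> 1 hidden unit (assumed weights)
    w2 = [[1.0], [-1.0]]     # 1 hidden unit -> 2 channel gates
    print(se_block(fmap, w1, w2))

    # Assumed 416-pixel input; stride 4 is an assumed extra detection scale.
    for stride, cells in grid_sizes(416, (32, 16, 8, 4)).items():
        # A 16-pixel target spans 16 / stride grid cells per side.
        print(stride, cells, 16 / stride)
```

Under these assumptions, a 16-pixel target covers only half a cell per side on the stride-32 grid but four cells per side on a stride-4 grid, which is the intuition behind fusing deep semantic features with a shallower, higher-resolution layer.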

Change history

15 October 2022
A Correction to this paper has been published: https://doi.org/10.1007/s11042-022-14030-4

Acknowledgements

We thank the anonymous reviewers for their insightful comments. This work was supported by the Key Program of the National Natural Science Foundation of China (U2003208), the Major Science and Technology Projects in the Autonomous Region (2020A03004-4), and the real-time underwater specific target autonomous recognition project (2019750001).

Author information

Authors and Affiliations

College of Information Science and Engineering, Xinjiang University, Urumqi, 830000, China
Xiangyu Wei & Pengcheng Feng
College of Network Center, Xinjiang University, Urumqi, 830000, China
Long Yu
Signal and Signal Processing Laboratory, College of Information Science and Engineering, Xinjiang University, Urumqi, 830000, China
Long Yu
College of Software, Xinjiang University, Urumqi, 830000, China
Shengwei Tian
Key Laboratory of Software Engineering Technology, College of Software, Xinjiang University, Urumqi, 830000, China
Shengwei Tian
Institute of Semiconductors, Chinese Academy of Sciences, Beijing, 100000, China
Xin Ning

Contributions

XW contributed to the conception of the study, the experiments and the paper. LY and ST guided the experiments and the paper. PF and XN participated in the experiments and the paper and offered constructive suggestions.

Corresponding author

Correspondence to Long Yu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: The Fig. 9 caption in the original publication contains a mistake and the reference referred to is incomplete.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wei, X., Yu, L., Tian, S. et al. Underwater target detection with an attention mechanism and improved scale. Multimed Tools Appl 80, 33747–33761 (2021). https://doi.org/10.1007/s11042-021-11230-2

Received: 26 November 2020
Revised: 17 January 2021
Accepted: 07 July 2021
Published: 25 August 2021
Issue Date: October 2021
DOI: https://doi.org/10.1007/s11042-021-11230-2
