Abstract
Poor lighting and complex environments make target detection in underwater images difficult. These images are usually blurry because tiny inorganic and organic particles in the water scatter and absorb light. To address this problem, we add squeeze-and-excitation modules after the deep convolutional layers of the YOLOv3 model to learn the relationships between channels and enhance the semantic information of deep features. In addition, many small targets lose too much information after five downsampling operations, which hinders their detection. By expanding the detection scale, we combine deep semantic information with the location information of shallower layers to improve the detection of small targets. The experimental results show that the YOLOv3-brackish model greatly improves the detection of small fish, crabs, shrimp and starfish, with minor improvements for big fish and jellyfish. The mean average precision increased by 4.43%.
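As an illustration of the channel reweighting described in the abstract, the following is a minimal NumPy sketch of a generic squeeze-and-excitation block. It is not the authors' implementation: the function name `se_block`, the reduction ratio `r = 4`, the random weights, and the 13 x 13 feature map (what five stride-2 downsamplings would give for an assumed 416 x 416 input) are all assumptions chosen for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(features, w1, b1, w2, b2):
    """Generic squeeze-and-excitation on one feature map.

    features: array of shape (C, H, W)
    w1, b1:   bottleneck layer, shapes (C // r, C) and (C // r,)
    w2, b2:   expansion layer, shapes (C, C // r) and (C,)
    Returns the reweighted features and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = features.mean(axis=(1, 2))                # shape (C,)
    # Excitation: bottleneck + ReLU, then expansion + sigmoid.
    hidden = np.maximum(w1 @ z + b1, 0.0)         # shape (C // r,)
    gates = sigmoid(w2 @ hidden + b2)             # shape (C,), values in (0, 1)
    # Scale: multiply every channel by its learned gate.
    return features * gates[:, None, None], gates

# Hypothetical sizes: 8 channels, reduction ratio 4, 13 x 13 deep feature map.
rng = np.random.default_rng(0)
C, H, W, r = 8, 13, 13, 4
features = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1       # untrained, random weights
b1 = np.zeros(C // r)
w2 = rng.standard_normal((C, C // r)) * 0.1
b2 = np.zeros(C)

out, gates = se_block(features, w1, b1, w2, b2)
print(out.shape, gates.shape)
```

In a trained network the two weight matrices are learned, so the gates emphasize informative channels and suppress less useful ones; where exactly such blocks sit in YOLOv3-brackish is specified in the full paper, not in this sketch.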
Change history
15 October 2022
A Correction to this paper has been published: https://doi.org/10.1007/s11042-022-14030-4
Acknowledgements
We thank the anonymous reviewers for their insightful comments. This work was supported by the Key Program of the National Natural Science Foundation of China (U2003208), the Major Science and Technology Projects in the Autonomous Region (2020A03004-4) and the real-time underwater specific target autonomous recognition project (2019750001).
Author information
Contributions
XW contributed to the conception of the study, the experiments and the writing of the paper. LY and ST guided the experiments and the writing of the paper. PF and XN participated in the experiments and the writing of the paper and offered constructive suggestions.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest.
Additional information
The original online version of this article was revised: The Fig. 9 caption in the original publication contains a mistake and the reference referred to is incomplete.
About this article
Cite this article
Wei, X., Yu, L., Tian, S. et al. Underwater target detection with an attention mechanism and improved scale. Multimed Tools Appl 80, 33747–33761 (2021). https://doi.org/10.1007/s11042-021-11230-2
DOI: https://doi.org/10.1007/s11042-021-11230-2