Underwater target detection with an attention mechanism and improved scale

Multimedia Tools and Applications

A Correction to this article was published on 15 October 2022

This article has been updated

Abstract

Poor lighting and complex environments make target detection in underwater images difficult. These images are usually blurry because tiny inorganic and organic particles in the water strongly affect light. To address this problem, we add squeeze-and-excitation modules after the deep convolution layers of the YOLOv3 model to learn the relationships between channels and enhance the semantic information of deep features. In addition, many small targets lose too much information after five downsampling operations, which hinders detection. By expanding the detection scale, we combine deep semantic information with the location information of shallower layers to improve the detection of small targets. The experimental results show that the YOLOv3-brackish model greatly improves the detection of small fish, crabs, shrimp and starfish, with minor improvements for big fish and jellyfish. The mean average precision increased by 4.43%.
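The two ideas in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the article's implementation. The names `se_block` and `grid_size`, the toy weights, the reduction ratio of 2 and the 416-pixel input side are all assumptions chosen for illustration. The first function follows the general squeeze-and-excitation recipe (global average pooling, two fully connected layers with ReLU and sigmoid, channel-wise rescaling). The second only shows how each halving of resolution shrinks the feature grid, which is why small targets are hard to recover from the deepest layers.

```python
import math


def se_block(feature_map, w1, w2):
    """Apply a squeeze-and-excitation step to one feature map.

    feature_map: list of C channels, each an H x W list of lists of floats.
    w1: (C // r) x C weight matrix, w2: C x (C // r) weight matrix.
    Biases are omitted to keep the sketch short.
    Returns (rescaled feature map, per-channel gates).
    """
    # Squeeze: global average pooling gives one number per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


def grid_size(input_size, num_downsamples):
    """Side length of the feature grid after repeated stride-2 downsampling."""
    return input_size // (2 ** num_downsamples)


if __name__ == "__main__":
    # Toy input: 4 channels of 2 x 2 values; all numbers here are made up.
    fmap = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(4)]
    w1 = [[0.1, -0.2, 0.3, 0.0], [0.0, 0.2, -0.1, 0.4]]   # hypothetical weights
    w2 = [[0.5, -0.5], [0.1, 0.2], [-0.3, 0.3], [0.0, 0.1]]
    scaled, gates = se_block(fmap, w1, w2)
    print("channel gates:", [round(g, 3) for g in gates])

    # Assumed 416-pixel input side; five halvings leave a 13 x 13 grid.
    for n in (5, 4, 3, 2):
        print(n, "downsamplings ->", grid_size(416, n), "cells per side")
```

Under the assumed 416-pixel input, five downsampling steps leave 13 cells per side, so one cell covers 32 input pixels and a target only a few pixels wide contributes little to it. Shallower grids keep finer location detail, which is the motivation the abstract gives for combining them with deep semantic features.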

Acknowledgements

We thank the anonymous reviewers for their insightful comments. This work was supported by the Key Program of the National Natural Science Foundation of China (U2003208), the Major Science and Technology Projects of the Autonomous Region (2020A03004-4) and the Real-Time Underwater Specific Target Autonomous Recognition Project (2019750001).

Author information

Contributions

XW contributed to the conception of the study, the experiments and the manuscript. LY and ST guided the experiments and the manuscript. PF and XN participated in the experiments and the manuscript and offered constructive suggestions.

Corresponding author

Correspondence to Long Yu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: The Fig. 9 caption in the original publication contains a mistake and the reference referred to is incomplete.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wei, X., Yu, L., Tian, S. et al. Underwater target detection with an attention mechanism and improved scale. Multimed Tools Appl 80, 33747–33761 (2021). https://doi.org/10.1007/s11042-021-11230-2

  • DOI: https://doi.org/10.1007/s11042-021-11230-2
