Underwater object detection: architectures and algorithms – a comprehensive review

Multimedia Tools and Applications

Abstract

Underwater object detection is an essential step in image processing and plays a vital role in applications such as the repair and maintenance of sub-aquatic structures and marine sciences. Many computer vision-based solutions have been proposed, but no optimal solution for underwater object detection and species classification exists. This is mainly because of the challenges posed by the underwater environment, chiefly light scattering and light absorption. The advent of deep learning has enabled researchers to address problems such as protection of the subaquatic ecological environment, emergency rescue, underwater disaster prevention and risk reduction, and underwater target detection, tracking, and recognition. However, the advantages and shortcomings of these deep learning algorithms are still unclear. To give a clearer view of underwater object detection algorithms and their pros and cons, we present a state-of-the-art review of the computer vision-based approaches developed so far. In addition, we compare various state-of-the-art schemes using objective indices and outline future research directions in the field of underwater object detection.

Author information

Corresponding author

Correspondence to Shabir A. Parah.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Fayaz, S., Parah, S.A. & Qureshi, G.J. Underwater object detection: architectures and algorithms – a comprehensive review. Multimed Tools Appl 81, 20871–20916 (2022). https://doi.org/10.1007/s11042-022-12502-1

  • DOI: https://doi.org/10.1007/s11042-022-12502-1
