Abstract
Underwater object detection is an essential step in image processing and plays a vital role in applications such as the repair and maintenance of sub-aquatic structures and the marine sciences. Many computer vision-based solutions have been proposed, but no optimal solution for underwater object detection and species classification exists. This is mainly because of the challenges posed by the underwater environment, chiefly light scattering and light absorption. The advent of deep learning has enabled researchers to address problems such as protection of the subaquatic ecological environment, emergency rescue, underwater disaster prevention and risk reduction, and underwater target detection, tracking, and recognition. However, the advantages and shortcomings of these deep learning algorithms are still unclear. To give a clearer view of underwater object detection algorithms and their pros and cons, we present a state-of-the-art review of the computer vision-based approaches developed to date. In addition, we compare various state-of-the-art schemes on several objective indices and outline future research directions in the field of underwater object detection.
Cite this article
Fayaz, S., Parah, S.A. & Qureshi, G.J. Underwater object detection: architectures and algorithms – a comprehensive review. Multimed Tools Appl 81, 20871–20916 (2022). https://doi.org/10.1007/s11042-022-12502-1
DOI: https://doi.org/10.1007/s11042-022-12502-1