Abstract
High-frequency forward-looking sonar is an effective device for acquiring key information about underwater objects, and the detection and segmentation of such objects is a key topic of current research. Deep learning has shown excellent performance in image feature extraction and has been used extensively for image object detection and instance segmentation. As network depth increases, however, training accuracy saturates while the number of training parameters grows rapidly. In this paper, a series of residual blocks is used to build a 32-layer feature extraction network that replaces the ResNet50/101 backbone in Mask RCNN, reducing the number of training parameters while maintaining detection performance. The proposed network has 29% fewer parameters than ResNet50 and 50.2% fewer than ResNet101, which is of great significance for future hardware implementation. In addition, the Adagrad optimizer is introduced to improve detection performance on sonar images. Finally, object detection results on 500 test sonar images show an mAP of 96.97%, which is only 0.18 percentage points below ResNet50 (97.15%) and higher than ResNet101 (95.15%).
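For readers unfamiliar with the Adagrad optimizer mentioned above, the following is a minimal sketch of its standard per-parameter update rule. It is not the training code used in this work: the function name adagrad_step, the learning rate, the epsilon value and the toy quadratic objective are all assumptions chosen for illustration, since the abstract gives no hyperparameters.

```python
import math

def adagrad_step(params, grads, accum, lr=0.01, eps=1e-8):
    """Apply one Adagrad update in place and return the parameters.

    accum holds the running sum of squared gradients for each parameter,
    so parameters with a history of large gradients take smaller steps.
    """
    for i, g in enumerate(grads):
        accum[i] += g * g
        params[i] -= lr * g / (math.sqrt(accum[i]) + eps)
    return params

# Hypothetical toy objective: f(w) = w0^2 + 10 * w1^2
w = [1.0, 1.0]
state = [0.0, 0.0]
for _ in range(200):
    grad = [2.0 * w[0], 20.0 * w[1]]
    adagrad_step(w, grad, state, lr=0.5)
```

Because each gradient is divided by the root of its own accumulated squares, the two coordinates of this toy problem move at essentially the same rate even though one gradient is ten times larger than the other. This per-parameter scaling is the property that distinguishes Adagrad from plain stochastic gradient descent.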









Acknowledgements
This work is supported by the National Natural Science Foundation of China (Grant No. 6150010825) and the project of Jiangsu Province’s six talent peak funding: deep sea ROV obstacle avoidance sonar (No. KTHY-026).
Cite this article
Fan, Z., Xia, W., Liu, X. et al. Detection and segmentation of underwater objects from forward-looking sonar based on a modified Mask RCNN. SIViP 15, 1135–1143 (2021). https://doi.org/10.1007/s11760-020-01841-x
DOI: https://doi.org/10.1007/s11760-020-01841-x