Abstract
In this paper, we present a novel method that incorporates convolutional neural networks (CNNs) into a Markov random field (MRF) to automatically segment side scan sonar (SSS) images into object-highlight, object-shadow and sea-bottom reverberation areas. As a widely used ocean survey sensor, SSS provides high-resolution maps of the seafloor. Automatically segmenting SSS images in real time can assist the navigation and path planning of autonomous underwater vehicles. However, because of the speckle noise and intensity inhomogeneity in SSS images, it is difficult to find a robust SSS segmentation method. These difficulties motivate us to explore efficient CNN architectures. For pixel-level SSS segmentation, a CNN with multi-scale inputs (MSCNN) is employed so that the context information and the details around a central pixel are used simultaneously. In addition, to mitigate the impact of the class imbalance problem, two MSCNN training strategies are introduced, which are based on data augmentation and ensemble learning. Furthermore, to take into account the local dependencies of class labels, the MSCNN results are used to initialize the MRF, which produces the final segmentation maps. Experimental results on real SSS images show that the proposed segmentation method outperforms MRF, CNN and semantic segmentation methods such as the fully convolutional network and SegNet in segmentation accuracy and generalization performance. Moreover, the efficiency of the proposed method is demonstrated on a retinal image dataset.
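The last stage of the pipeline, in which per-pixel CNN outputs initialize an MRF that enforces local label consistency, can be illustrated with a minimal sketch. The abstract does not state the energy function or the optimizer, so everything specific here is an assumption made for illustration: a Potts smoothness prior on a 4-neighborhood, iterated conditional modes (ICM) as the optimizer, and the hypothetical names `argmax_labels`, `icm_refine`, `beta` and `n_iters`. The input `probs` stands in for the class probabilities a network such as the MSCNN would produce.

```python
import math


def argmax_labels(probs):
    """Initial labeling: the most probable class at each pixel."""
    return [[max(range(len(p)), key=p.__getitem__) for p in row] for row in probs]


def icm_refine(probs, beta=1.0, n_iters=5, eps=1e-12):
    """Refine a CNN-initialized label map with ICM (assumed optimizer).

    probs   : H x W x K nested lists of per-pixel class probabilities.
    beta    : weight of the Potts prior (assumed form of the MRF prior).
    n_iters : maximum number of sweeps over the image.
    """
    height, width = len(probs), len(probs[0])
    n_classes = len(probs[0][0])
    labels = argmax_labels(probs)  # the CNN output initializes the MRF

    for _ in range(n_iters):
        changed = False
        for i in range(height):
            for j in range(width):
                # Labels of the 4-connected neighbors inside the image.
                neighbors = [
                    labels[y][x]
                    for y, x in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                    if 0 <= y < height and 0 <= x < width
                ]
                best_label, best_energy = labels[i][j], float("inf")
                for k in range(n_classes):
                    # Data term: negative log-probability from the CNN.
                    unary = -math.log(probs[i][j][k] + eps)
                    # Potts term: penalty for each disagreeing neighbor.
                    pairwise = beta * sum(1 for n in neighbors if n != k)
                    energy = unary + pairwise
                    if energy < best_energy:
                        best_label, best_energy = k, energy
                if best_label != labels[i][j]:
                    labels[i][j] = best_label
                    changed = True
        if not changed:  # converged to a local minimum of the energy
            break
    return labels


if __name__ == "__main__":
    # Toy 3 x 3 two-class map: one weakly misclassified pixel in the center.
    toy = [[[0.9, 0.1] for _ in range(3)] for _ in range(3)]
    toy[1][1] = [0.4, 0.6]
    print(icm_refine(toy, beta=1.0))
```

In this toy case the isolated center label is overruled by its four neighbors once `beta` is large enough, which is the kind of speckle-induced error a neighborhood prior is meant to suppress. Setting `beta=0` disables the prior and returns the plain per-pixel argmax.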
Acknowledgements
This work has been supported by the Fundamental Research Funds for the Central Universities (62420078614132).
Cite this article
Liu, P., Song, Y. Segmentation of sonar imagery using convolutional neural networks and Markov random field. Multidim Syst Sign Process 31, 21–47 (2020). https://doi.org/10.1007/s11045-019-00652-9
DOI: https://doi.org/10.1007/s11045-019-00652-9