Abstract
Semantic segmentation of remote sensing images with deep convolutional neural networks has proven effective. However, because remote sensing images are complex, deep convolutional neural networks have difficulty segmenting objects with weak appearance coherence, even though they represent the local features of objects effectively. Road network segmentation in remote sensing images faces two major problems: high inter-individual similarity and ubiquitous occlusion. To address these issues, this paper develops a novel method for extracting roads from complex remote sensing images. We design a Dual Dense Connected Attention network (DDCAttNet) that establishes long-range dependencies between road features. The network architecture incorporates both spatial attention and channel attention information into semantic segmentation for accurate road segmentation. Experimental results on the benchmark dataset demonstrate the superiority of the proposed approach in both quantitative and qualitative evaluation.
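To make the idea of combining channel attention and spatial attention concrete, the following is a minimal, parameter-free sketch in plain Python. It is not the DDCAttNet architecture: the function names (`channel_attention`, `spatial_attention`, `dual_attention`) are hypothetical, the gates are computed from simple averages rather than learned layers, and the dense connections between attention blocks are omitted. It only illustrates how a feature map can be reweighted first per channel and then per spatial position.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Scale each channel by a gate derived from its global average.

    fmap is a nested list of shape [channels][height][width].
    Assumption: a real network would pass the pooled values through
    learned layers; here the gate is just sigmoid(mean).
    """
    gates = []
    for channel in fmap:
        total = sum(sum(row) for row in channel)
        count = len(channel) * len(channel[0])
        gates.append(sigmoid(total / count))
    return [[[v * g for v in row] for row in channel]
            for channel, g in zip(fmap, gates)]


def spatial_attention(fmap):
    """Scale each pixel by a gate derived from the mean over channels.

    Assumption: a real network would use a learned convolution over
    pooled channel statistics; here the mask is just sigmoid(mean).
    """
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    mask = [[sigmoid(sum(fmap[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]
    return [[[fmap[k][i][j] * mask[i][j] for j in range(w)]
             for i in range(h)] for k in range(c)]


def dual_attention(fmap):
    """Apply channel attention followed by spatial attention."""
    return spatial_attention(channel_attention(fmap))


# Toy feature map: 2 channels, 2x2 pixels each.
fmap = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
out = dual_attention(fmap)
```

Because both gates lie strictly between 0 and 1, the output keeps the input's shape and never exceeds the input in magnitude. Zero activations stay zero.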
This research was supported in part by the National Key Research and Development Plan Key Special Projects under Grant No. 2018YFB2100303, the Shandong Province Colleges and Universities Youth Innovation Technology Plan Innovation Team Project under Grant No. 2020KJN011, the Shandong Provincial Natural Science Foundation under Grant No. ZR2020MF060, the Program for Innovative Postdoctoral Talents in Shandong Province under Grant No. 40618030001, the National Natural Science Foundation of China under Grant No. 61802216, and the Postdoctoral Science Foundation of China under Grant No. 2018M642613.
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Yuan, G., Li, J., Lv, Z., Li, Y., Xu, Z. (2021). DDCAttNet: Road Segmentation Network for Remote Sensing Images. In: Liu, Z., Wu, F., Das, S.K. (eds) Wireless Algorithms, Systems, and Applications. WASA 2021. Lecture Notes in Computer Science, vol 12938. Springer, Cham. https://doi.org/10.1007/978-3-030-86130-8_36
DOI: https://doi.org/10.1007/978-3-030-86130-8_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86129-2
Online ISBN: 978-3-030-86130-8
eBook Packages: Computer Science (R0)