Abstract
State-of-the-art deep neural network models trained on natural scenes do not transfer well to the semantic segmentation of remote sensing images. Studies have shown that fine-tuning methods that incorporate model fusion can alleviate this problem. In this paper, we present an approach for improving U-Net and propose an end-to-end deep convolutional neural network (DCNN) that combines the strengths of DenseNet, U-Net, dilated convolution, and DeconvNet. We evaluated the proposed method and model on the Potsdam orthophoto data set. Compared with U-Net, our approach increases the pixel accuracy (PA), mean pixel accuracy (mPA), and mean intersection over union (mIoU) by 11.1%, 14.0%, and 13.5%, respectively; the segmentation speed increases by approximately 1.18 times, and the number of parameters is 59.0% of that of U-Net. The experiments demonstrate that, for the semantic segmentation of high-resolution remote sensing images, using combined dilated convolutions as the primary feature extractor, using transposed convolution to restore the size of the feature maps, and reducing the number of layers together form an effective way to improve the overall performance of U-Net. This research enriches the family of DCNN-based models and the ways of applying DCNNs to a specific scene.
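The abstract reports PA, mPA, and mIoU without defining them here. As a rough illustration only, the sketch below computes the three indexes from a confusion matrix using their conventional definitions. It is not the authors' evaluation code. The function names, the toy labels, and the choice to skip classes that never occur are all assumptions made for this example.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return (PA, mPA, mIoU) from a square confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    pa = correct / total  # fraction of all pixels classified correctly

    class_acc, class_iou = [], []
    for i in range(n):
        gt = sum(cm[i])                          # pixels labelled as class i
        pred = sum(cm[j][i] for j in range(n))   # pixels predicted as class i
        union = gt + pred - cm[i][i]
        # Assumption: classes absent from the data are skipped, not scored 0.
        if gt > 0:
            class_acc.append(cm[i][i] / gt)
        if union > 0:
            class_iou.append(cm[i][i] / union)
    mpa = sum(class_acc) / len(class_acc)    # mean of per-class accuracies
    miou = sum(class_iou) / len(class_iou)   # mean of per-class IoU values
    return pa, mpa, miou


# Toy example: flattened labels for 8 pixels and 3 hypothetical classes.
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 0, 2]
pa, mpa, miou = segmentation_metrics(confusion_matrix(y_true, y_pred, 3))
```

PA weights every pixel equally, so it is dominated by frequent classes. mPA and mIoU average over classes, which is why all three are usually reported together for imbalanced land-cover data.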
Code Availability
Software applications or custom code generated or used during the study are available from the corresponding author by request.
Funding
This study was funded by the National Key Research Program Project (No. 2016YFD0300610) and the Heilongjiang Province “Hundred Million” Engineering Science and Technology Major Special Project (No. 2019ZX14A04).
Author information
Contributions
Conceptualization, Z.S. and W.L.; methodology, Z.S.; software, W.L.; validation, W.L.; formal analysis, R.G. and Z.M.; investigation, Z.S. and W.L.; resources, R.G. and Z.M.; data curation, W.L.; writing—original draft preparation, W.L. and R.G.; writing—review and editing, W.L. and R.G.; visualization, W.L. and Z.M.; supervision, R.G.; project administration, Z.S.; funding acquisition, Z.S. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of Interests
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Additional information
Availability of data and material
The data and materials generated or used during the research are available from the corresponding authors upon request.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Su, Z., Li, W., Ma, Z. et al. An improved U-Net method for the semantic segmentation of remote sensing images. Appl Intell 52, 3276–3288 (2022). https://doi.org/10.1007/s10489-021-02542-9
DOI: https://doi.org/10.1007/s10489-021-02542-9