Abstract
The complexity of objects and optical conditions in high-resolution aerial imagery makes fine-grained semantic segmentation difficult. Although various deep neural network structures have been proposed to improve segmentation accuracy, there is still room for improvement by making full use of multiscale features and by integrating individual weak classifiers into a strong classifier. In this paper, we use a reduced SegNet network to realize end-to-end classification of high-resolution aerial images. In addition, to exploit multiscale information, we present R-SegUnet, which combines the feature information of each convolution block in the reduced SegNet encoding network with the feature information of the corresponding convolution block in the decoding network. Furthermore, because the surface features in high-resolution aerial images are very complex, we investigate a 6to2_Net that converts the six-class model into six binary-classification models to improve the recognition of small objects. Finally, we ensemble these three models to obtain the segmentation results. Experimental results on the ISPRS Potsdam benchmark dataset show that our algorithm achieves state-of-the-art performance. We also analyze the inference performance of our models on a variety of parallel computing devices.
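The two ideas behind 6to2_Net and the final ensemble can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: the six classes are indexed 0 to 5, the six-to-binary conversion is a plain one-vs-rest split of the label map, and the ensemble is soft voting (averaging per-pixel class probabilities). The function names `to_binary_masks` and `soft_vote` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence

NUM_CLASSES = 6  # six-class setting from the abstract; the class order is assumed


def to_binary_masks(label_map: List[List[int]],
                    num_classes: int = NUM_CLASSES) -> List[List[List[int]]]:
    """Split a multi-class label map into one binary (one-vs-rest) mask per class."""
    return [
        [[1 if pixel == c else 0 for pixel in row] for row in label_map]
        for c in range(num_classes)
    ]


def soft_vote(prob_maps: Sequence[List[List[List[float]]]]) -> List[List[int]]:
    """Average per-pixel class probabilities over models, then take the argmax.

    prob_maps[m][y][x][c] is model m's probability of class c at pixel (y, x).
    Soft voting is an assumed combination rule, not necessarily the paper's.
    """
    n_models = len(prob_maps)
    height, width = len(prob_maps[0]), len(prob_maps[0][0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            n_cls = len(prob_maps[0][y][x])
            mean = [sum(m[y][x][c] for m in prob_maps) / n_models
                    for c in range(n_cls)]
            row.append(max(range(n_cls), key=mean.__getitem__))
        result.append(row)
    return result
```

Under soft voting, one confident model can outweigh two models that only weakly prefer another class. Majority voting would not allow this, which is one reason an ensemble's combination rule matters.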









Data availability
The data that support the findings of this study are available upon request from the authors.
Funding
This work was supported in part by the Key Research and Development Program of Shaanxi Program under Grant 2022ZDLGY01-09, and in part by the GHfund A under Grant 202107014474 and Grant 202202036165.
About this article
Cite this article
Zhu, H., Liu, C., Li, Q. et al. Deep convolutional encoder–decoder networks based on ensemble learning for semantic segmentation of high-resolution aerial imagery. CCF Trans. HPC 6, 408–424 (2024). https://doi.org/10.1007/s42514-024-00184-0
DOI: https://doi.org/10.1007/s42514-024-00184-0