Hybrid first and second order attention Unet for building segmentation in remote sensing images

He, Nanjun; Fang, Leyuan; Plaza, Antonio

doi:10.1007/s11432-019-2791-7

Research Paper
Published: 09 March 2020

Volume 63, article number 140305 (2020)
Science China Information Sciences

Abstract

Recently, building segmentation (BS) has drawn significant attention in remote sensing applications. Convolutional neural networks (CNNs) have become the mainstream approach in this field owing to their powerful representational ability. However, because building appearance varies widely, designing an effective CNN architecture for BS remains a challenging task. Most CNN-based BS methods focus on deeper or wider network architectures and neglect the correlation among intermediate features. To address this problem, we propose a hybrid first and second order attention network (HFSA) that explores both the global mean and the inner product among different channels to adaptively rescale intermediate features. As a result, the HFSA not only makes full use of first order feature statistics but also incorporates second order feature statistics, which leads to more representative features. We conduct a series of comprehensive experiments on three widely used aerial building segmentation data sets and one satellite building segmentation data set. The experimental results show that the proposed model achieves better segmentation performance than state-of-the-art models in both quantitative and qualitative terms.
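The abstract describes the attention mechanism only at a high level, so the following is a minimal sketch of the general idea rather than the authors' HFSA block. It computes a first order statistic (the global mean of each channel) and a second order statistic (scaled inner products between channels), then uses both to rescale the channels. The function name `hybrid_channel_attention`, the row-mean summary of the inner-product matrix, the additive fusion, and the sigmoid gate are all assumptions made for illustration; the published model presumably uses learned layers that are not described here.

```python
import math


def sigmoid(x):
    """Logistic function; adequate for the small values used in this sketch."""
    return 1.0 / (1.0 + math.exp(-x))


def hybrid_channel_attention(features):
    """Rescale channels using first and second order statistics.

    features: list of C channels, each a list of N floats
              (a feature map with its spatial dimensions flattened).
    Returns (weights, rescaled), where weights has one gate per channel.
    """
    n = len(features[0])

    # First order statistic: global mean of each channel.
    first = [sum(ch) / n for ch in features]

    # Second order statistic: inner product between every pair of
    # channels, divided by N, giving a C x C matrix.
    gram = [
        [sum(a * b for a, b in zip(ci, cj)) / n for cj in features]
        for ci in features
    ]
    # Assumption: summarize each channel's row by its mean.
    second = [sum(row) / len(row) for row in gram]

    # Assumption: fuse the two descriptors by addition and gate with a
    # sigmoid; no learned parameters are involved in this sketch.
    weights = [sigmoid(f + s) for f, s in zip(first, second)]

    # Rescale every channel by its attention weight.
    rescaled = [[w * v for v in ch] for w, ch in zip(weights, features)]
    return weights, rescaled


# Toy input: two channels, two spatial positions each.
weights, rescaled = hybrid_channel_attention([[1.0, 1.0], [0.0, 0.0]])
```

In this toy input the all-zero channel has zero mean and zero inner product with every channel, so it receives the neutral gate value, while the active channel receives a larger weight.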

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61922029, 61771192), the National Natural Science Foundation of China for International Cooperation and Exchanges (Grant No. 61520106001), and the Huxiang Young Talents Plan Project of Hunan Province (Grant No. 2019RS2016).

Author information

Authors and Affiliations

College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China
Nanjun He & Leyuan Fang
Hyperspectral Computing Laboratory, Department of Technology of Computers and Communications, Escuela Politecnica, University of Extremadura, Extremadura, E-10003, Spain
Antonio Plaza

Corresponding author

Correspondence to Leyuan Fang.

About this article

Cite this article

He, N., Fang, L. & Plaza, A. Hybrid first and second order attention Unet for building segmentation in remote sensing images. Sci. China Inf. Sci. 63, 140305 (2020). https://doi.org/10.1007/s11432-019-2791-7

Received: 01 November 2019
Revised: 31 December 2019
Accepted: 11 February 2020
Published: 09 March 2020
DOI: https://doi.org/10.1007/s11432-019-2791-7
