A feature enhancement network combining UNet and vision transformer for building change detection in high-resolution remote sensing images

  • Original Article
  • Neural Computing and Applications

Abstract

Building change detection (CD) is important for understanding ground changes and human activities. Deep learning has become the mainstream approach to building CD, but detection accuracy remains insufficient because of limitations in feature extraction. This paper therefore proposes a feature enhancement network, FENET-UEVTS, which combines a UNet encoder with a vision transformer structure to detect building changes more accurately. It improves the detection of irregular buildings and the separation of changes between adjacent buildings in different locations. By combining a deep convolutional network with part of a vision transformer structure, the model extracts robust features for various types of buildings. We design a spatial-channel attention mechanism module (SCAM) that takes both the spatial and channel dimensions into account to improve the detection of small-scale buildings. We also develop a U-shaped residual module (USRM) and a strengthened feature extraction module (SFEM) to improve feature extraction for buildings with different shapes and edge details. A self-attention feature fusion module (SAFFM) is proposed to fully fuse the different feature information; it better distinguishes buildings of various shapes and sizes, which helps prevent false and missed detections. To minimize information loss, a cross-channel context semantic aggregation module (CCSAM) aggregates information along the channel dimension. We evaluated the model in extensive experiments on three widely used public CD datasets: LEVIR-CD, WHU-CD, and CDD. The results show that the proposed model outperforms eight other state-of-the-art (SOTA) algorithms in F1-score, overall accuracy, and Kappa coefficient, reaching F1-scores of up to 91.83%, 87.65%, and 93.29% on LEVIR-CD, WHU-CD, and CDD.
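The three evaluation metrics named in the abstract (F1-score, overall accuracy, and Kappa coefficient) follow the standard definitions for binary classification. The sketch below shows one way to compute them from a predicted change map and a ground-truth map. It is an illustration under assumptions, not the authors' evaluation code: the function names `confusion_counts` and `change_detection_metrics` are hypothetical, and labels are assumed to be flat sequences of 0 (unchanged) and 1 (changed).

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for two flat sequences of 0/1 labels (1 = changed)."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def change_detection_metrics(tp, fp, fn, tn):
    """Return precision, recall, F1, overall accuracy and Cohen's kappa."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    # Overall accuracy: fraction of pixels labelled correctly.
    oa = (tp + tn) / total
    # Chance agreement: product of predicted and actual marginals per class.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (total * total)
    # Kappa is undefined when pe == 1; returning 0.0 here is an assumption.
    kappa = (oa - pe) / (1 - pe) if pe != 1 else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "overall_accuracy": oa,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Toy example: 1000 pixels, 50 of them truly changed.
    print(change_detection_metrics(tp=40, fp=10, fn=10, tn=940))
```

Because changed pixels are usually a small minority in CD datasets, overall accuracy can stay high even when many changes are missed, which is why F1-score and Kappa are commonly reported alongside it.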

Code Availability

The source code for the implemented algorithms is available upon request. Please contact the corresponding author for access.

Funding

This work was supported by the Key Research and Development and Promotion Program of Henan Province (No. 232102210115).

Author information

Corresponding author

Correspondence to Xianwei Han.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

This manuscript is approved by all authors for publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sun, Y., Zhao, Y., Han, X. et al. A feature enhancement network combining UNet and vision transformer for building change detection in high-resolution remote sensing images. Neural Comput & Applic 37, 1429–1456 (2025). https://doi.org/10.1007/s00521-024-10666-5

  • DOI: https://doi.org/10.1007/s00521-024-10666-5
