Global and edge enhanced transformer for semantic segmentation of remote sensing

Wang, Hengyou; Li, Xiao; Huo, Lianzhi; Hu, Changmiao

doi:10.1007/s10489-024-05457-3

Published: 24 April 2024

Volume 54, pages 5658–5673, (2024)

Applied Intelligence

Hengyou Wang (ORCID: orcid.org/0000-0001-6693-0161)¹, Xiao Li¹, Lianzhi Huo² & Changmiao Hu²


Abstract

Global context information and edge information are key to the semantic segmentation of remote sensing (RS) images. However, existing methods have limited ability to capture global and edge information, and they suffer from blurred category edges and poor efficiency when recognizing small-scale objects. In this work, we propose a global and edge enhanced Transformer (GE-Swin) for the semantic segmentation of remote sensing images. To improve sensitivity to edge information, we design dual decoders that operate in parallel. One is the main decoder, which extracts multi-level semantic information from multi-scale features. The other is an auxiliary decoder that operates on low-layer features with low resolution and therefore has better sensitivity to edge information. A feature fusion module (FFM) is then placed between the encoder and decoder to fuse the multilevel features, enhancing the model’s ability to obtain global features. Finally, to verify the performance of the proposed approach, we perform extensive experiments on the ISPRS and LoveDA datasets. The experimental results show that the proposed model achieves superior performance compared to state-of-the-art methods.
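The abstract gives no implementation details, so the following is only a rough NumPy sketch of the general idea of fusing multilevel encoder features and feeding two parallel prediction heads. It is not the authors' FFM or decoder design. The function names, the four-level pyramid shapes, the nearest-neighbour upsampling, the channel concatenation, and the linear heads are all assumptions made for illustration.

```python
import numpy as np

def upsample_nearest(feat, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map by an integer factor."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_multilevel(features):
    """Align every level to the resolution of the first level, then
    concatenate along the channel axis (a generic fusion, assumed here)."""
    target_h = features[0].shape[1]
    aligned = [upsample_nearest(f, target_h // f.shape[1]) for f in features]
    return np.concatenate(aligned, axis=0)

def linear_head(feat, weight):
    """Per-pixel linear classifier: (K, C) weights on a (C, H, W) map -> (K, H, W) logits."""
    return np.einsum("kc,chw->khw", weight, feat)

rng = np.random.default_rng(0)

# Hypothetical four-level encoder pyramid; channel counts and sizes are illustrative.
pyramid = [rng.standard_normal((c, s, s)) for c, s in [(8, 32), (16, 16), (32, 8), (64, 4)]]
num_classes = 6  # illustrative value, not taken from the abstract

# "Main" branch: predicts from the fused multi-scale features.
fused = fuse_multilevel(pyramid)
main_logits = linear_head(fused, rng.standard_normal((num_classes, fused.shape[0])))

# "Auxiliary" branch: predicts from the lowest-layer features only.
aux_logits = linear_head(pyramid[0], rng.standard_normal((num_classes, pyramid[0].shape[0])))

print(fused.shape, main_logits.shape, aux_logits.shape)
```

In a trained model the two branches would typically be supervised jointly, but how GE-Swin combines its main and auxiliary decoders is not stated in the abstract.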

Data availability and access

All relevant results and data are within the manuscript. The data that support the findings of this study are openly available in references [38, 39, 40].

References

1. Zhu Z, Zhang J, Yang Z, Aljaddani AH, Cohen WB, Qiu S, Zhou C (2020) Continuous monitoring of land disturbance based on landsat time series. Remote Sens Environ 238:111116
2. Yu Y, Bao Y, Wang J, Chu H, Zhao N, He Y, Liu Y (2021) Crop row segmentation and detection in paddy fields based on treble-classification otsu and double-dimensional clustering method. Remote Sens 13(5):901
3. Zhang J, Lin S, Ding L, Bruzzone L (2020) Multi-scale context aggregation for semantic segmentation of remote sensing images. Remote Sens 12(4):701
4. Sun L, Zou H, Wei J, Li M, Cao X, He S, Liu S (2022) Semantic segmentation of high-resolution remote sensing images based on sparse self-attention. In: IGARSS 2022-2022 IEEE international geoscience and remote sensing symposium, IEEE, pp 3492–3495
5. Jin J, Zhou W, Yang R, Ye L, Yu L (2023) Edge detection guide network for semantic segmentation of remote-sensing images. IEEE Geosci Remote Sens Lett 20:1–5
6. Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S et al (2020) An image is worth 16x16 words: Transformers for image recognition at scale. arXiv:2010.11929
7. Wang W, Tang C, Wang X, Zheng B (2022) A vit-based multiscale feature fusion approach for remote sensing image segmentation. IEEE Geosci Remote Sens Lett 19:1–5
8. Zhong HF, Sun Q, Sun HM, Jia RS (2022) Nt-net: A semantic segmentation network for extracting lake water bodies from optical remote sensing images based on transformer. IEEE Trans Geosci Remote Sens 60:1–13
9. Li Y, Ouyang S, Zhang Y (2022) Combining deep learning and ontology reasoning for remote sensing image semantic segmentation. Knowl-Based Syst 243:108469
10. Wang L, Li R, Zhang C, Fang S, Duan C, Meng X, Atkinson PM (2022) Unetformer: A unet-like transformer for efficient semantic segmentation of remote sensing urban scene imagery. ISPRS J Photogramm Remote Sens 190:196–214
11. Zhang C, Lu X, Ye Q, Wang C, Yang C, Wang Q (2022) Mfenet: Multi-feature extraction net for remote sensing semantic segmentation. In: 2022 7th International conference on intelligent computing and signal processing (ICSP), IEEE, pp 1986–1990
12. Liu R, Mi L, Chen Z (2020) Afnet: Adaptive fusion network for remote sensing image semantic segmentation. IEEE Trans Geosci Remote Sens 59(9):7871–7886
13. Zheng Z, Zhong Y, Wang J, Ma A (2020) Foreground-aware relation network for geospatial object segmentation in high spatial resolution remote sensing imagery. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 4096–4105
14. Diakogiannis FI, Waldner F, Caccetta P, Wu C (2020) Resunet-a: A deep learning framework for semantic segmentation of remotely sensed data. ISPRS J Photogramm Remote Sens 162:94–114
15. Xiao T, Liu Y, Huang Y, Li M, Yang G (2023) Enhancing multiscale representations with transformer for remote sensing image semantic segmentation. IEEE Trans Geosci Remote Sens 61:1–16
16. Zheng S, Lu J, Zhao H, Zhu X, Luo Z, Wang Y, Fu Y, Feng J, Xiang T, Torr PH et al (2021) Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 6881–6890
17. Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B (2021) Swin transformer: Hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 10012–10022
18. Ding L, Lin D, Lin S, Zhang J, Cui X, Wang Y, Tang H, Bruzzone L (2022) Looking outside the window: Wide-context transformer for the semantic segmentation of high-resolution remote sensing images. IEEE Trans Geosci Remote Sens 60:1–13
19. Xu Z, Zhang W, Zhang T, Yang Z, Li J (2021) Efficient transformer for remote sensing image segmentation. Remote Sens 13(18):3585
20. Zhang Y, Gao X, Duan Q, Yuan L, Gao X (2022) Dht: Deformable hybrid transformer for aerial image segmentation. IEEE Geosci Remote Sens Lett 19:1–5
21. Ye W, Zhang W, Lei W, Zhang W, Chen X, Wang Y (2023) Remote sensing image instance segmentation network with transformer and multi-scale feature representation. Expert Syst Appl 234:121007
22. Gao L, Liu H, Yang M, Chen L, Wan Y, Xiao Z, Qian Y (2021) Stransfuse: Fusing swin transformer and convolutional neural network for remote sensing image semantic segmentation. IEEE J Sel Top Appl Earth Obs Remote Sens 14:10990–11003
23. He X, Zhou Y, Zhao J, Zhang D, Yao R, Xue Y (2022) Swin transformer embedding unet for remote sensing image semantic segmentation. IEEE Trans Geosci Remote Sens 60:1–15
24. Meng X, Yang Y, Wang L, Wang T, Li R, Zhang C (2022) Class-guided swin transformer for semantic segmentation of remote sensing imagery. IEEE Geosci Remote Sens Lett 1–5
25. Feng D, Zhang Z, Yan K (2022) A semantic segmentation method for remote sensing images based on the swin transformer fusion gabor filter. IEEE Access 10:77432–77451
26. Zhang C, Jiang W, Zhang Y, Wang W, Zhao Q, Wang C (2022) Transformer and cnn hybrid deep neural network for semantic segmentation of very-high-resolution remote sensing imagery. IEEE Trans Geosci Remote Sens 60:1–20
27. Wang L, Li R, Duan C, Zhang C, Meng X, Fang S (2022) A novel transformer based semantic segmentation scheme for fine-resolution remote sensing images. IEEE Geosci Remote Sens Lett 19:1–5
28. Dong Z, Gao G, Liu T, Gu Y, Zhang X (2023) Distilling segmenters from cnns and transformers for remote sensing images semantic segmentation. IEEE Trans Geosci Remote Sens
29. Chu X, Tian Z, Wang Y, Zhang B, Ren H, Wei X, Xia H, Shen C (2021) Twins: Revisiting the design of spatial attention in vision transformers. Adv Neural Inf Process Syst 34:9355–9366
30. Fu J, Liu J, Tian H, Li Y, Bao Y, Fang Z, Lu H (2019) Dual attention network for scene segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 3146–3154
31. Zhao H, Shi J, Qi X, Wang X, Jia J (2017) Pyramid scene parsing network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2881–2890
32. Strudel R, Garcia R, Laptev I, Schmid C (2021) Segmenter: Transformer for semantic segmentation. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 7262–7272
33. Xie E, Wang W, Yu Z, Anandkumar A, Alvarez JM, Luo P (2021) Segformer: Simple and efficient design for semantic segmentation with transformers. Adv Neural Inf Process Syst 34:12077–12090
34. Nong Z, Su X, Liu Y, Zhan Z, Yuan Q (2021) Boundary-aware dual-stream network for vhr remote sensing images semantic segmentation. IEEE J Sel Top Appl Earth Obs Remote Sens 14:5260–5268
35. Chen LC, Zhu Y, Papandreou G, Schroff F, Adam H (2018) Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European conference on computer vision (ECCV), pp 801–818
36. Lin TY, Dollár P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2117–2125
37. Jin Z, Yu D, Song L, Yuan Z, Yu L (2022) You should look at all objects. In: European conference on computer vision, Springer, pp 332–349
38. Vaihingen I (2018) 2d semantic labeling dataset. Accessed: Apr
39. Potsdam I (2018) 2d semantic labeling dataset. Accessed: Apr
40. Wang J, Zheng Z, Ma A, Lu X, Zhong Y (2021) Loveda: A remote sensing land-cover dataset for domain adaptive semantic segmentation. arXiv:2110.08733

Author information

Authors and Affiliations

Beijing University of Civil Engineering and Architecture, Beijing, 100044, China
Hengyou Wang & Xiao Li
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100094, China
Lianzhi Huo & Changmiao Hu

Contributions

Conceptualization and methodology, Hengyou Wang and Xiao Li; software, experiments, and validation, Xiao Li and Changmiao Hu; writing (original draft preparation), Xiao Li and Changmiao Hu; writing (review and editing), Hengyou Wang and Lianzhi Huo; visualization, Xiao Li and Lianzhi Huo; project administration, Hengyou Wang; funding acquisition, Hengyou Wang and Lianzhi Huo. All authors have read and agreed to this version of the manuscript.

Corresponding author

Correspondence to Lianzhi Huo.

Ethics declarations

Ethical and informed consent for the data used

The authors declare no potential conflicts of interest or ethical problems related to the data used. The data we used are all available to researchers.

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported in part by the National Natural Science Foundation of China (Nos. 62072024, 41971396), the Outstanding Youth Program of Beijing University of Civil Engineering and Architecture (No. JDJQ20220805), and the BUCEA Post Graduate Innovation Project.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, H., Li, X., Huo, L. et al. Global and edge enhanced transformer for semantic segmentation of remote sensing. Appl Intell 54, 5658–5673 (2024). https://doi.org/10.1007/s10489-024-05457-3

Accepted: 07 April 2024
Published: 24 April 2024
Issue Date: April 2024
DOI: https://doi.org/10.1007/s10489-024-05457-3
