
DBDAN: Dual-Branch Dynamic Attention Network for Semantic Segmentation of Remote Sensing Images

  • Conference paper
  • In: Pattern Recognition and Computer Vision (PRCV 2023)

Abstract

The attention mechanism can capture long-range dependencies. However, because it computes each correlation independently, it can hardly account for the complex backgrounds of remote sensing images, which leads to noisy and ambiguous attention weights. To address this issue, we design a correlation attention module (CAM) that enhances appropriate correlations and suppresses erroneous ones by seeking consensus among all correlation vectors, thereby facilitating feature aggregation. We further embed the CAM into a local dynamic attention (LDA) branch and a global dynamic attention (GDA) branch to capture local texture details and global context, respectively. In addition, since complex and diverse geographical objects place different demands on local texture details and global context, we devise a dynamic weighting mechanism that adaptively adjusts the contributions of the two branches, thereby constructing a more discriminative feature representation. Experimental results on three datasets show that the proposed dual-branch dynamic attention network (DBDAN), which integrates the CAM and both branches, considerably improves semantic segmentation performance on remote sensing images and outperforms representative state-of-the-art methods.
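Since only the abstract is available here, the following is a minimal PyTorch sketch of the two ideas it names: a correlation attention module that refines the attention map by seeking consensus among its correlation vectors, and a per-pixel dynamic weighting of a local and a global branch. All class names, tensor shapes, the window size, and the exact form of the consensus operation are assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn


class CorrelationAttention(nn.Module):
    """Hypothetical correlation attention module (CAM).

    Standard self-attention computes each row of the (N, N) correlation map
    independently. Here the rows (correlation vectors) are re-weighted by
    their mutual similarity, so rows that agree with many others (the
    "consensus") are enhanced and outliers are suppressed before aggregation.
    """

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // reduction, 1)
        self.key = nn.Conv2d(channels, channels // reduction, 1)
        self.value = nn.Conv2d(channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)            # (B, N, C')
        k = self.key(x).flatten(2)                               # (B, C', N)
        v = self.value(x).flatten(2).transpose(1, 2)             # (B, N, C)

        corr = torch.softmax(q @ k / q.shape[-1] ** 0.5, dim=-1)  # (B, N, N)

        # Assumed consensus step: similarity between correlation vectors
        # re-weights them, smoothing noisy rows toward the consensus.
        consensus = torch.softmax(corr @ corr.transpose(1, 2), dim=-1)
        refined = consensus @ corr                               # (B, N, N)

        out = (refined @ v).transpose(1, 2).reshape(b, c, h, w)
        return out + x  # residual connection


class DualBranchFusion(nn.Module):
    """Hypothetical dynamic weighting of local (LDA) and global (GDA) branches."""

    def __init__(self, channels: int, window: int = 8):
        super().__init__()
        self.window = window                       # LDA acts on local windows
        self.local_attn = CorrelationAttention(channels)
        self.global_attn = CorrelationAttention(channels)
        # Per-pixel gate predicting the relative contribution of each branch.
        self.gate = nn.Sequential(nn.Conv2d(2 * channels, 2, 1), nn.Softmax(dim=1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        win = self.window
        # Local branch: CAM inside non-overlapping windows (assumes h and w
        # are divisible by the window size; padding omitted for brevity).
        local = x.reshape(b, c, h // win, win, w // win, win)
        local = local.permute(0, 2, 4, 1, 3, 5).reshape(-1, c, win, win)
        local = self.local_attn(local)
        local = local.reshape(b, h // win, w // win, c, win, win)
        local = local.permute(0, 3, 1, 4, 2, 5).reshape(b, c, h, w)

        # Global branch: CAM over the whole feature map.
        global_ = self.global_attn(x)

        # Dynamic weighting: per-pixel convex combination of the two branches.
        weights = self.gate(torch.cat([local, global_], dim=1))   # (B, 2, H, W)
        return weights[:, :1] * local + weights[:, 1:] * global_

The gate yields a per-pixel convex combination, so texture-rich regions can lean on the local branch while large homogeneous regions lean on the global one; the weighting actually used in DBDAN may differ in form.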



Author information

Correspondence to Wei Zhang.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Che, R., Ma, X., Hong, T., Wang, X., Feng, T., Zhang, W. (2024). DBDAN: Dual-Branch Dynamic Attention Network for Semantic Segmentation of Remote Sensing Images. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14428. Springer, Singapore. https://doi.org/10.1007/978-981-99-8462-6_25


  • DOI: https://doi.org/10.1007/978-981-99-8462-6_25

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8461-9

  • Online ISBN: 978-981-99-8462-6

  • eBook Packages: Computer Science, Computer Science (R0)
