Multihead attention mechanism guided ConvLSTM for pixel-level segmentation of ocean remote sensing images

Multimedia Tools and Applications

Abstract

Semantic segmentation of ocean remote sensing images classifies each pixel as ocean background or as a particular island type, and is an important research direction in remote sensing image processing. Because islands in these images vary widely in scale and have complex boundaries, it is difficult to extract accurate features, which in turn makes accurate segmentation difficult. Convolutional neural networks have gradually become the mainstream approach in image processing because they extract image features automatically and hierarchically. In this paper, MAGC-Net, a neural network model based on the multihead attention mechanism and ConvLSTM, is used to segment ocean remote sensing images and improve the accuracy of semantic segmentation. First, shallow features are obtained via multiscale convolution, and the multihead attention mechanism (global, local, maximum) assigns multiple weights to these features. Then, the integrated ConvLSTM module describes the semantic relationships between the features and generates deep features. Finally, residual blocks filter the deep features, reducing redundancy and improving segmentation accuracy. Experimental results on the NWPU-RESISC45 dataset demonstrate the effectiveness and robustness of the proposed algorithm.
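The abstract names three attention heads (global, local, maximum) but this preview does not define them. The following is a minimal NumPy sketch under one assumed reading, not the authors' implementation: the global head derives a per-channel weight from global average pooling, the maximum head derives a per-channel weight from global max pooling, and the local head derives a per-pixel weight from the mean over channels. The function name `three_head_attention`, the sigmoid gating, and the averaging of the three heads are all illustrative assumptions.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used here to squash pooled values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def three_head_attention(features):
    """Reweight a (C, H, W) feature map with three illustrative heads.

    Assumed reading of "global, local, maximum" (not taken from the paper):
      global  -> per-channel weight from global average pooling
      maximum -> per-channel weight from global max pooling
      local   -> per-pixel weight from the mean over channels
    """
    c, h, w = features.shape
    global_w = sigmoid(features.mean(axis=(1, 2))).reshape(c, 1, 1)
    max_w = sigmoid(features.max(axis=(1, 2))).reshape(c, 1, 1)
    local_w = sigmoid(features.mean(axis=0)).reshape(1, h, w)

    # Each head rescales the same features; broadcasting expands the weights.
    heads = [features * global_w, features * max_w, features * local_w]

    # Average the heads so the output keeps the input shape (an assumption;
    # concatenation followed by a 1x1 convolution would be another option).
    return np.mean(heads, axis=0)


rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 8, 8))  # stand-in for shallow features
out = three_head_attention(feats)
print(out.shape)  # (4, 8, 8)
```

Because every weight lies in (0, 1), this sketch can only attenuate features, never amplify them. In a trained network the weights would come from learned layers rather than fixed pooling statistics.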

Acknowledgments

This study was supported by the Project of Shandong Province Higher Educational Science and Technology Program under grant number J18KA394 and the Project of Binzhou University Doctoral Research under grant number 2016Y19.

Author information

Corresponding author

Correspondence to Shuai Pang.

Ethics declarations

Conflict of interest

Shuai Pang and Lianxue Gao declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Pang, S., Gao, L. Multihead attention mechanism guided ConvLSTM for pixel-level segmentation of ocean remote sensing images. Multimed Tools Appl 81, 24627–24643 (2022). https://doi.org/10.1007/s11042-022-12849-5

  • DOI: https://doi.org/10.1007/s11042-022-12849-5
