A rotation-invariant horizontal vertical pooled module for remote sensing image representation

Sitaula, Chiranjibi; Aryal, Jagannath

doi:10.1007/s00521-024-10180-8

Original Article
Published: 30 July 2024

Volume 36, pages 18661–18673, (2024)

Neural Computing and Applications

Abstract

Accurate information retrieval from multi-source and multi-resolution image data constitutes a foundation for knowledge discovery. Scene image classification using aerial very high resolution (VHR) images is a well-researched area in the remote sensing (RS) community, and most approaches rely on deep learning (DL)-based methods thanks to their remarkable classification performance. Nevertheless, existing DL-based methods still have a limited ability to capture precise spatial semantic information scattered along the horizontal and vertical directions across such images at multiple scales and rotations. We therefore propose a novel approach that employs a rotation-invariant horizontal vertical pooled module (RIHVPM) to represent aerial VHR RS images for stable and improved classification performance. Notably, the proposed RIHVPM benefits from multiple tensor rotations coupled with attention-enabled multiscale horizontal and vertical pooling operations for image representation. An experimental study on three benchmark datasets demonstrates competitive or higher classification performance (AID: 96.44%, NWPU: 94.32% and UCM: 99.04%) and robustness/stability (minimum standard deviation of 0.001) of the proposed approach.
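
The idea of combining tensor rotations with horizontal and vertical pooling can be illustrated with a minimal sketch. It is not the authors' RIHVPM: the abstract does not give the module's layers, so the attention and multiscale components are omitted, and the function names (`rotate90`, `hv_pool`, `rotation_pooled_descriptor`), the use of average pooling, and the choice of averaging over four 90-degree rotations are all assumptions made for illustration. The sketch works on a single square 2-D feature map stored as a list of rows.

```python
def rotate90(grid):
    """Rotate a 2-D grid (list of rows) by 90 degrees counter-clockwise."""
    # Transposing turns columns into rows; reversing the row order
    # then puts the original last column on top.
    return [list(col) for col in zip(*grid)][::-1]


def hv_pool(grid):
    """Horizontal and vertical average pooling (assumed pooling type).

    Horizontal pooling averages each row across its width; vertical
    pooling averages each column across its height. The two vectors
    are concatenated into one descriptor.
    """
    row_means = [sum(row) / len(row) for row in grid]
    col_means = [sum(col) / len(col) for col in zip(*grid)]
    return row_means + col_means


def rotation_pooled_descriptor(grid):
    """Average the pooled descriptors over the four 90-degree rotations.

    Hypothetical aggregation: because the four rotations of a rotated
    input are the same set as the four rotations of the original, the
    averaged descriptor is unchanged by 90-degree input rotations.
    """
    size = len(grid)
    if size == 0 or any(len(row) != size for row in grid):
        raise ValueError("this sketch assumes a non-empty square feature map")
    descriptors = []
    current = [list(row) for row in grid]
    for _ in range(4):
        descriptors.append(hv_pool(current))
        current = rotate90(current)
    # Element-wise mean across the four per-rotation descriptors.
    return [sum(values) / 4 for values in zip(*descriptors)]


if __name__ == "__main__":
    feature_map = [
        [1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
        [7.0, 8.0, 10.0],
    ]
    original = rotation_pooled_descriptor(feature_map)
    rotated = rotation_pooled_descriptor(rotate90(feature_map))
    same = all(abs(a - b) < 1e-9 for a, b in zip(original, rotated))
    print("descriptor unchanged by 90-degree rotation:", same)
```

Averaging over the rotation group is only one way to obtain invariance; a maximum or a learned, attention-weighted combination would fit the same skeleton, and the invariance shown here holds only for multiples of 90 degrees.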

Data availability statement

All three datasets used in this paper are publicly available. UCM: http://weegee.vision.ucmerced.edu/datasets/landuse.html; AID: https://captain-whu.github.io/AID/ and NWPU: https://gcheng-nwpu.github.io/.

Acknowledgements

Jagannath Aryal is supported by internal funding from the University of Melbourne (UoM) for this research. Further, this research, which is the outcome of the first author's postdoctoral research work, was supported by UoM's Research Computing Services and the Petascale Campus Initiative.

Author information

Chiranjibi Sitaula and Jagannath Aryal have contributed equally to this work.

Authors and Affiliations

Earth Observation and AI Research Group, Department of Infrastructure Engineering, The University of Melbourne, 700 Swanton Street, Carlton, VIC, 3053, Australia
Chiranjibi Sitaula & Jagannath Aryal

Contributions

CS was involved in data curation, conceptualization, methodology, software, writing (original draft preparation), and writing (review and editing). JA was involved in supervision, writing, review, validation, resources, and project administration.

Corresponding author

Correspondence to Chiranjibi Sitaula.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Code availability

The source code will be made available upon request by the journal and will be publicly available after publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sitaula, C., Aryal, J. A rotation-invariant horizontal vertical pooled module for remote sensing image representation. Neural Comput & Applic 36, 18661–18673 (2024). https://doi.org/10.1007/s00521-024-10180-8

Received: 18 March 2024
Accepted: 01 July 2024
Published: 30 July 2024
Issue Date: October 2024
DOI: https://doi.org/10.1007/s00521-024-10180-8
