Remote Sensing Scene Classification Based on Covariance Pooling of Multi-layer CNN Features Guided by Saliency Maps

Akodad, Sara; Bombrun, Lionel; Germain, Christian; Berthoumieu, Yannick

doi:10.1007/978-3-031-09037-0_47

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13363)

Included in the following conference series:

International Conference on Pattern Recognition and Artificial Intelligence

Abstract

The new generation of remote sensing imaging sensors delivers images with high spatial, spectral and temporal resolution at high revisit frequencies, allowing the acquisition of multi-spectral and multi-temporal images. The availability of these data has spurred the remote sensing community to develop novel machine learning strategies for supervised classification. This paper introduces a novel supervised classification algorithm based on covariance pooling of multi-layer convolutional neural network (CNN) features. The basic idea is an ensemble learning approach built on covariance matrices estimated from CNN features. These matrices are projected onto the log-Euclidean space, and an SVM classifier then makes the decision. To give more weight to relatively small objects of interest in the scene, we propose to incorporate the visual saliency map in the process. To that end, inspired by the theory of robust statistics, a weighted covariance matrix estimator is considered, in which larger weights are given to more salient regions. Finally, remote sensing scene classification experiments are conducted on the UC Merced land use dataset. The results confirm the potential of the proposed approach in terms of scene classification accuracy. Beyond the interest of exploiting second-order statistics and adopting an ensemble learning approach, they demonstrate the benefit of incorporating visual saliency maps.
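
The pooling step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `weighted_covariance` and `log_euclidean_vector`, the normalization of the weights to sum to one, the `eps` regularization, and the random stand-in data are all assumptions made for illustration. The sketch treats one CNN layer as a set of n local feature vectors of dimension d, uses one saliency value per spatial location as the weight, and maps the resulting covariance matrix to a vector through the matrix logarithm.

```python
import numpy as np


def weighted_covariance(features, weights, eps=1e-6):
    """Weighted covariance of local features (hypothetical helper).

    features: array of shape (n, d), one row per spatial location.
    weights:  array of shape (n,), non-negative (e.g. saliency values).
    eps:      small ridge added to keep the matrix positive definite.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalize weights to sum to one
    mu = w @ features                    # weighted mean, shape (d,)
    centered = features - mu
    cov = (centered * w[:, None]).T @ centered
    return cov + eps * np.eye(features.shape[1])


def log_euclidean_vector(cov):
    """Map an SPD matrix to a vector via the matrix logarithm.

    Off-diagonal entries are scaled by sqrt(2) so that the Euclidean norm
    of the vector equals the Frobenius norm of log(cov).
    """
    vals, vecs = np.linalg.eigh(cov)
    log_cov = (vecs * np.log(vals)) @ vecs.T   # V diag(log(vals)) V^T
    upper = np.triu_indices(cov.shape[0], k=1)
    return np.concatenate([np.diag(log_cov), np.sqrt(2.0) * log_cov[upper]])


# Stand-in data: a 14x14 feature map with 8 channels and a random saliency map.
rng = np.random.default_rng(0)
features = rng.normal(size=(14 * 14, 8))
saliency = rng.random(14 * 14)

descriptor = log_euclidean_vector(weighted_covariance(features, saliency))
```

In such a setup, one descriptor of length d(d+1)/2 would be computed per layer and passed to a classifier such as an SVM; how the per-layer decisions are combined in the ensemble is not specified in the abstract and is left out here.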

Supported by CNES TEMPOSS project.

Author information

Authors and Affiliations

Université de Bordeaux, CNRS, IMS, UMR 5218, Groupe Signal et Image, 33405, Talence, France
Sara Akodad, Lionel Bombrun, Christian Germain & Yannick Berthoumieu

Authors

Sara Akodad
Lionel Bombrun
Christian Germain
Yannick Berthoumieu

Corresponding author

Correspondence to Sara Akodad.

Editor information

Editors and Affiliations

Télécom SudParis, Palaiseau, France
Mounîm El Yacoubi
École de Technologie Supérieure, Montreal, QC, Canada
Eric Granger
Hong Kong Baptist University, Kowloon, Hong Kong
Pong Chi Yuen
Indian Statistical Institute, Kolkata, India
Umapada Pal
Université Paris Cité, Paris, France
Nicole Vincent

About this paper

Cite this paper

Akodad, S., Bombrun, L., Germain, C., Berthoumieu, Y. (2022). Remote Sensing Scene Classification Based on Covariance Pooling of Multi-layer CNN Features Guided by Saliency Maps. In: El Yacoubi, M., Granger, E., Yuen, P.C., Pal, U., Vincent, N. (eds) Pattern Recognition and Artificial Intelligence. ICPRAI 2022. Lecture Notes in Computer Science, vol 13363. Springer, Cham. https://doi.org/10.1007/978-3-031-09037-0_47

DOI: https://doi.org/10.1007/978-3-031-09037-0_47
Published: 02 June 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-09036-3
Online ISBN: 978-3-031-09037-0
eBook Packages: Computer Science, Computer Science (R0)
