
Exploiting representations from pre-trained convolutional neural networks for high-resolution remote sensing image retrieval

Multimedia Tools and Applications

Abstract

As the volume of high-resolution remote sensing images grows, retrieving images efficiently from large archives becomes increasingly urgent. Existing retrieval methods rely mainly on shallow features, which are easily affected by human intervention. Convolutional neural networks (CNNs), in contrast, learn feature representations automatically, and CNNs pre-trained on large-scale datasets yield generic representations. This paper exploits representations from pre-trained CNNs for high-resolution remote sensing image retrieval. CNN representations from AlexNet, VGGM, VGG16, and GoogLeNet are first transferred to high-resolution remote sensing images, and CNN features are then extracted via two approaches: taking the outputs of high-level layers directly, and aggregating the outputs of mid-level layers by average pooling with different pooling regions. Because the CNN features are generic and high-dimensional, feature combination and feature compression are also adopted to improve the feature representation. Experimental results show that aggregated features with a pooling region smaller than the feature map size perform particularly well, especially for VGG16 and GoogLeNet. Combining shallow features with CNN features substantially improves retrieval precision, and compressed features reduce redundancy effectively. Compared with state-of-the-art methods, the proposed feature extraction methods are simple, and the resulting features improve retrieval performance significantly.
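The aggregation, combination, and compression steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify the pooling stride, the normalization, the compression method, or the distance measure, so the sketch assumes non-overlapping pooling windows, L2 normalization, PCA via SVD, and Euclidean distance. All function names are hypothetical, and random arrays stand in for real mid-level CNN outputs and shallow features.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row (or a single vector) to unit L2 norm."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def aggregate_avg_pool(fmap, region):
    """Average-pool an (H, W, C) feature map over non-overlapping
    region x region windows and flatten the result into one vector.
    Non-overlapping windows are an assumption of this sketch."""
    h, w, c = fmap.shape
    rows, cols = h // region, w // region
    if rows == 0 or cols == 0:
        raise ValueError("pooling region is larger than the feature map")
    cropped = fmap[:rows * region, :cols * region, :]
    pooled = cropped.reshape(rows, region, cols, region, c).mean(axis=(1, 3))
    return pooled.reshape(-1)

def combine(feature_sets):
    """Concatenate per-image feature matrices after normalizing each one."""
    stacked = np.concatenate([l2_normalize(f) for f in feature_sets], axis=-1)
    return l2_normalize(stacked)

def pca_compress(features, k):
    """Project (N, D) features onto their first k principal components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

def rank_by_distance(query, database):
    """Return database indices sorted by Euclidean distance to the query."""
    return np.argsort(np.linalg.norm(database - query, axis=1))

# Stand-in data: 20 images, each with a 14 x 14 x 32 mid-level feature map
# and a 64-dimensional shallow descriptor (both random, for illustration).
rng = np.random.default_rng(0)
fmaps = rng.random((20, 14, 14, 32))
shallow = rng.random((20, 64))

# A 7 x 7 pooling region is smaller than the 14 x 14 map, giving 2 x 2 cells.
cnn = np.stack([aggregate_avg_pool(f, region=7) for f in fmaps])
combined = combine([cnn, shallow])
compressed = pca_compress(combined, k=8)
order = rank_by_distance(compressed[0], compressed)
```

Setting `region` equal to the feature map size reduces the aggregation to global average pooling, while a smaller region keeps coarse spatial layout at the cost of a longer descriptor, which is one motivation for the compression step.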

Acknowledgements

This work was supported by the National Natural Science Foundation of China [grant numbers 41261091, 61662044, 61663031, and 61762067].

Author information

Corresponding author

Correspondence to Famao Ye.

About this article

Cite this article

Ge, Y., Jiang, S., Xu, Q. et al. Exploiting representations from pre-trained convolutional neural networks for high-resolution remote sensing image retrieval. Multimed Tools Appl 77, 17489–17515 (2018). https://doi.org/10.1007/s11042-017-5314-5

  • DOI: https://doi.org/10.1007/s11042-017-5314-5
