Skip to main content

Landmark Recognition: From Small-Scale to Large-Scale Retrieval

  • Chapter
  • First Online:
Recent Advances in Computer Vision

Part of the book series: Studies in Computational Intelligence ((SCI,volume 804))

Abstract

During the last years, the problem of landmark recognition is addressed in many different ways. Landmark recognition is related to finding the most similar images to a starting one in a particular dataset of buildings or places. This chapter explains the most used techniques for solving the problem of landmark recognition, with a specific focus on techniques based on deep learning. Firstly, the focus is on the classical approaches for the creation of descriptors used in the content-based image retrieval task. Secondly, the deep learning approach that has shown overwhelming improvements in many tasks of computer vision, is presented. A particular attention is put on the major recent breakthroughs in Content-Based Image Retrieval (CBIR), the first one is transfer learning which improves the feature representation and therefore accuracy of the retrieval system. The second one is the fine-tuning technique, that allows to highly improve the performance of the retrieval system, is presented. Finally, the chapter exposes the techniques for large-scale retrieval, in which datasets contain at least a million images.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 129.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Hardcover Book
USD 169.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    http://people.csail.mit.edu/torralba/shortCourseRLOC/.

References

  1. Bhattacharya, P., Gavrilova, M.: A survey of landmark recognition using the bag-of-words framework. In: Intelligent Computer Graphics 2012, pp. 243–263. Springer (2013)

    Google Scholar 

  2. Liu, Y., Zhang, D., Lu, G., Ma, W.Y.: A survey of content-based image retrieval with high-level semantics. Pattern Recogn. 40(1), 262–282 (2007)

    Article  MATH  Google Scholar 

  3. Hare, J.S., Lewis, P.H., Enser, P.G., Sandom, C.J.: Mind the gap: another look at the problem of the semantic gap in image retrieval. In: Multimedia Content Analysis, Management, and Retrieval 2006, vol. 6073, p. 607309. International Society for Optics and Photonics (2006)

    Google Scholar 

  4. Muneesawang, P., Zhang, N., Guan, L.: Mobile landmark recognition. In: Multimedia Database Retrieval, pp. 131–145. Springer (2014)

    Google Scholar 

  5. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)

    Article  MathSciNet  Google Scholar 

  6. Bay, H., Tuytelaars, T., Van Gool, L.: SURF: speeded up robust features. In: European Conference on Computer Vision, pp. 404–417. Springer (2006)

    Google Scholar 

  7. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition (2014). arXiv:1409.1556

  8. Zheng, L., Yang, Y., Tian, Q.: SIFT meets CNN: a decade survey of instance retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 40(5), 1224–1244 (2017)

    Article  Google Scholar 

  9. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A., et al.: Going deeper with convolutions. In: IEEE Conference on Computer Vision and Pattern Recognition (2015)

    Google Scholar 

  10. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

    Google Scholar 

  11. Sivic, J., Zisserman, A.: Video google: a text retrieval approach to object matching in videos. In: International Conference on Computer Vision, vol. 2, pp. 1470–1477 (2003)

    Google Scholar 

  12. Jégou, H., Douze, M., Schmid, C.: Hamming embedding and weak geometric consistency for large scale image search. In: European Conference on Computer Vision, pp. 304–317 (2008)

    Google Scholar 

  13. Jégou, H., Douze, M., Schmid, C., Pérez, P.: Aggregating local descriptors into a compact image representation. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 3304–3311 (2010)

    Google Scholar 

  14. Perronnin, F., Liu, Y., Sánchez, J., Poirier, H.: Large-scale image retrieval with compressed fisher vectors. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3384–3391. IEEE (2010)

    Google Scholar 

  15. Jégou, H., Douze, M., Schmid, C.: On the burstiness of visual elements. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1169–1176. IEEE (2009)

    Google Scholar 

  16. Tolias, G., Avrithis, Y., Jégou, H.: To aggregate or not to aggregate: selective match kernels for image search. In: IEEE International Conference on Computer Vision, pp. 1401–1408 (2013)

    Google Scholar 

  17. Jégou, H., Chum, O.: Negative evidences and co-occurences in image retrieval: the benefit of PCA and whitening. In: European Conference on Computer Vision, pp. 774–787 (2012)

    Chapter  Google Scholar 

  18. Arandjelovic, R., Zisserman, A.: All about VLAD. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 1578–1585 (2013)

    Google Scholar 

  19. Wang, Z., Di, W., Bhardwaj, A., Jagadeesh, V., Piramuthu, R.: Geometric VLAD for large scale image search. In: International Conference on Machine Learning (2014)

    Google Scholar 

  20. Perronnin, F., Sánchez, J., Mensink, T.: Improving the fisher kernel for large-scale image classification. In: European Conference on Computer Vision, pp. 143–156 (2010)

    Chapter  Google Scholar 

  21. Zhao, W.L., Jégou, H., Gravier, G.: Oriented pooling for dense and non-dense rotation-invariant features. In: British Machine Vision Conference (2013)

    Google Scholar 

  22. Zhou, Q., Wang, C., Liu, P., Li, Q., Wang, Y., Chen, S.: Distribution entropy boosted VLAD for Image Retrieval. Entropy (2016)

    Google Scholar 

  23. Eggert, C., Romberg, S., Lienhart, R.: Improving VLAD: hierarchical coding and a refined local coordinate system. In: International Conference on Image Processing (2014)

    Google Scholar 

  24. Liu, Z., Wang, S., Tian, Q.: Fine-residual VLAD for image retrieval. Neurocomputing 173, 1183–1191 (2016)

    Article  Google Scholar 

  25. Magliani, F., Bidgoli, N.M., Prati, A.: A location-aware embedding technique for accurate landmark recognition. In: International Conference on Distributed Smart Cameras (2017)

    Google Scholar 

  26. Arandjelovic, R., Gronat, P., Torii, A., Pajdla, T., Sivic, J.: NetVLAD: CNN architecture for weakly supervised place recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 5297–5307 (2016)

    Google Scholar 

  27. Yue-Hei Ng, J., Yang, F., Davis, L.S.: Exploiting local features from deep networks for image retrieval. In: IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 53–61 (2015)

    Google Scholar 

  28. Yan, K., Wang, Y., Liang, D., Huang, T., Tian, Y.: CNN vs. SIFT for image retrieval: alternative or complementary? In: ACM Multimedia Conference, pp. 407–411. ACM (2016)

    Google Scholar 

  29. Reddy Mopuri, K., Venkatesh Babu, R.: Object level deep feature pooling for compact image representation. In: IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 62–70 (2015)

    Google Scholar 

  30. Tolias, G., Sicre, R., Jégou, H.: Particular object retrieval with integral max-pooling of CNN activations. In: International Conference on Learning Representation (2015)

    Google Scholar 

  31. Gordo, A., Almazán, J., Revaud, J., Larlus, D.: Deep image retrieval: learning global representations for image search. In: European Conference on Computer Vision, pp. 241–257. Springer (2016)

    Google Scholar 

  32. Seddati, O., Dupont, S., Mahmoudi, S., Parian, M., Dolez, B.: Towards good practices for image retrieval based on CNN features. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1246–1255 (2017)

    Google Scholar 

  33. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)

    Google Scholar 

  34. Babenko, A., Slesarev, A., Chigorin, A., Lempitsky, V.: Neural codes for image retrieval. In: European Conference on Computer Vision, pp. 584–599. Springer (2014)

    Google Scholar 

  35. Gordo, A., Almazan, J., Revaud, J., Larlus, D.: End-to-end learning of deep visual representations for image retrieval. Int. J. Comput. Vis. 124(2), 237–254 (2017)

    Article  MathSciNet  Google Scholar 

  36. Bottou, L.: Large-scale machine learning with stochastic gradient descent. In: International Conference on Computational Statistics, pp. 177–186. Springer (2010)

    Google Scholar 

  37. Weinberger, K.Q., Saul, L.K.: Distance metric learning for large margin nearest neighbor classification. J. Mach. Learn. Res. 10, 207–244 (2009)

    Google Scholar 

  38. Chavez, E., Figueroa, K., Navarro, G.: Effective proximity retrieval by ordering permutations. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1647–1658 (2008)

    Article  Google Scholar 

  39. Amato, G., Gennaro, C., Savino, P.: Mi-file: using inverted files for scalable approximate similarity search. Multimed. Tools Appl. 71(3), 1333–1362 (2014)

    Article  Google Scholar 

  40. Indyk, P., Motwani, R.: Approximate nearest neighbors: towards removing the curse of dimensionality. In: ACM Symposium on Theory of Computing, pp. 604–613. ACM (1998)

    Google Scholar 

  41. Datar, M., Immorlica, N., Indyk, P., Mirrokni, V.S.: Locality-sensitive hashing scheme based on p-stable distributions. In: Symposium on Computational Geometry, pp. 253–262. ACM (2004)

    Google Scholar 

  42. Lv, Q., Josephson, W., Wang, Z., Charikar, M., Li, K.: Multi-probe LSH: efficient indexing for high-dimensional similarity search. In: International Conference on Very Large Data Bases, pp. 950–961. VLDB Endowment (2007)

    Google Scholar 

  43. Wang, J., Shen, H.T., Song, J., Ji, J.: Hashing for similarity search: a survey (2014). arXiv:1408.2927

  44. Magliani, F., Fontanini, T., Prati, A.: Efficient nearest neighbors search for large-scale landmark recognition (2018). arXiv:1806.05946

    Chapter  Google Scholar 

  45. Jégou, H., Douze, M., Schmid, C.: Product quantization for nearest neighbor search. IEEE Trans. Pattern Anal. Mach. Intell. 33(1), 117–128 (2011)

    Article  Google Scholar 

  46. Kalantidis, Y., Mellina, C., Osindero, S.: Cross-dimensional weighting for aggregated deep convolutional features. In: European Conference on Computer Vision, pp. 685–701. Springer (2016)

    Google Scholar 

  47. Ge, T., He, K., Ke, Q., Sun, J.: Optimized product quantization. IEEE Trans. Pattern Anal. Mach. Intell. 36(4), 744–755 (2014)

    Article  Google Scholar 

  48. Muja, M., Lowe, D.G.: Scalable nearest neighbor algorithms for high dimensional data. IEEE Trans. Pattern Anal. Mach. Intell. 36(11), 2227–2240 (2014)

    Article  Google Scholar 

  49. Philbin, J., Chum, O., Isard, M., Sivic, J., Zisserman, A.: Lost in quantization: Improving particular object retrieval in large scale image databases. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2008)

    Google Scholar 

  50. Philbin, J., Chum, O., Isard, M., Sivic, J., Zisserman, A.: Object retrieval with large vocabularies and fast spatial matching. In: IEEE Conference on Computer Vision and Pattern Recognition (2007)

    Google Scholar 

  51. Nister, D., Stewenius, H.: Scalable recognition with a vocabulary tree. In: IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 2161–2168. IEEE (2006)

    Google Scholar 

  52. Huiskes, M.J., Lew, M.S.: The MIR flickr retrieval evaluation. In: ACM International Conference on Multimedia Information Retrieval, pp. 39–43. ACM (2008)

    Google Scholar 

  53. Laskar, Z., Kannala, J.: Context aware query image representation for particular object retrieval. In: Scandinavian Conference on Image Analysis, pp. 88–99. Springer (2017)

    Google Scholar 

  54. Kalantidis, Y., Avrithis, Y.: Locally optimized product quantization for approximate nearest neighbor search. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2321–2328 (2014)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Federico Magliani .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Check for updates. Verify currency and authenticity via CrossMark

Cite this chapter

Magliani, F., Fontanini, T., Prati, A. (2019). Landmark Recognition: From Small-Scale to Large-Scale Retrieval. In: Hassaballah, M., Hosny, K. (eds) Recent Advances in Computer Vision. Studies in Computational Intelligence, vol 804. Springer, Cham. https://doi.org/10.1007/978-3-030-03000-1_10

Download citation

Publish with us

Policies and ethics