Landmark Recognition: From Small-Scale to Large-Scale Retrieval

Magliani, Federico; Fontanini, Tomaso; Prati, Andrea

doi:10.1007/978-3-030-03000-1_10

Part of the book series: Studies in Computational Intelligence (SCI, volume 804)

Abstract

In recent years, the problem of landmark recognition has been addressed in many different ways. Landmark recognition consists of finding, in a given dataset of buildings or places, the images most similar to a query image. This chapter explains the most widely used techniques for solving this problem, with a specific focus on those based on deep learning. First, it covers the classical approaches for creating the descriptors used in Content-Based Image Retrieval (CBIR). Second, it presents the deep learning approach, which has brought substantial improvements in many computer vision tasks. Particular attention is paid to two major recent breakthroughs in CBIR: transfer learning, which improves the feature representation and therefore the accuracy of the retrieval system, and fine-tuning, which considerably improves retrieval performance. Finally, the chapter presents techniques for large-scale retrieval, in which datasets contain at least a million images.
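
The retrieval task described in the abstract can be illustrated with a minimal sketch. It assumes that each image has already been reduced to a fixed-length global descriptor and that similarity is measured with cosine similarity; both are illustrative assumptions, not details taken from the chapter. The function names, file names and toy descriptor values are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a descriptor to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def rank_by_similarity(query, database, top_k=3):
    """Return (image_id, cosine similarity) pairs, most similar first."""
    q = l2_normalize(query)
    scored = []
    for image_id, descriptor in database.items():
        d = l2_normalize(descriptor)
        # The dot product of two unit vectors is their cosine similarity.
        scored.append((image_id, sum(a * b for a, b in zip(q, d))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional descriptors (hypothetical values); real descriptors
# would typically have hundreds or thousands of dimensions.
database = {
    "tower_a.jpg": [0.9, 0.1, 0.0],
    "tower_b.jpg": [0.8, 0.2, 0.1],
    "bridge.jpg": [0.0, 0.2, 0.9],
}
results = rank_by_similarity([1.0, 0.1, 0.0], database, top_k=2)
```

This exhaustive scan compares the query with every database image, which is only practical at small scale; for datasets of a million images or more, approximate nearest neighbor search is generally used instead.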

Author information

Authors and Affiliations

IMPLab, Department of Engineering and Architecture, University of Parma, Parma, Italy
Federico Magliani, Tomaso Fontanini & Andrea Prati

Corresponding author

Correspondence to Federico Magliani.

Editor information

Editors and Affiliations

Department of Computer Science, South Valley University, Luxor, Egypt
Mahmoud Hassaballah
Department of Information Technology, Zagazig University, Zagazig, Egypt
Khalid M. Hosny

About this chapter

Cite this chapter

Magliani, F., Fontanini, T., Prati, A. (2019). Landmark Recognition: From Small-Scale to Large-Scale Retrieval. In: Hassaballah, M., Hosny, K. (eds) Recent Advances in Computer Vision. Studies in Computational Intelligence, vol 804. Springer, Cham. https://doi.org/10.1007/978-3-030-03000-1_10

DOI: https://doi.org/10.1007/978-3-030-03000-1_10
Published: 15 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02999-9
Online ISBN: 978-3-030-03000-1
eBook Packages: Intelligent Technologies and Robotics (R0)
