Memory efficient large-scale image-based localization

Lu, Guoyu; Sebe, Nicu; Xu, Congfu; Kambhamettu, Chandra

doi:10.1007/s11042-014-1977-3

Published: 08 May 2014

Multimedia Tools and Applications, volume 74, pages 479–503 (2015)

Abstract

Local features are widely used in image-based localization. However, large-scale 2D-to-3D matching still consumes massive amounts of memory, mainly because of the high dimensionality of the features (e.g., the 128 dimensions of a SIFT descriptor). This paper introduces a method that reduces the dimensionality of local features in order to lower memory consumption and accelerate descriptor matching. All descriptors are projected into a lower-dimensional space through learned matrices, which mitigates the curse of dimensionality in large-scale image-based localization. The low-dimensional descriptors are then mapped into a Hamming space to reduce the memory requirement further. The paper also proposes an image-based localization pipeline built on the learned Hamming descriptors. The learned descriptor and the localization pipeline are evaluated on two challenging datasets. The experimental results show that the proposed method achieves very strong image registration performance compared with published results from state-of-the-art methods.
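
The general idea of projecting a descriptor and then binarizing it can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not give the learned matrices, the code length, or the binarization rule, so the sketch substitutes a random Gaussian projection, a 64-bit code, a fixed centering offset of 127.5, and sign thresholding, all of which are assumptions made for illustration. The function names are hypothetical. It also assumes SIFT descriptors stored as 128 unsigned bytes, so a 64-bit code would occupy 8 bytes instead of 128.

```python
import random


def make_projection(in_dim, out_bits, seed=0):
    """Random Gaussian matrix standing in for a learned projection (assumption)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_bits)]


def to_hamming_code(descriptor, projection, offset=127.5):
    """Project a centered descriptor and keep one sign bit per output dimension."""
    code = 0
    for row in projection:
        value = sum(w * (x - offset) for w, x in zip(row, descriptor))
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming_distance(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


IN_DIM, OUT_BITS = 128, 64  # 128-D SIFT; 64 bits is an illustrative choice
projection = make_projection(IN_DIM, OUT_BITS)

rng = random.Random(1)
base = [rng.randint(0, 255) for _ in range(IN_DIM)]
# A slightly perturbed copy of `base` and an unrelated descriptor.
near = [min(255, max(0, v + rng.randint(-3, 3))) for v in base]
far = [rng.randint(0, 255) for _ in range(IN_DIM)]

code_base = to_hamming_code(base, projection)
d_near = hamming_distance(code_base, to_hamming_code(near, projection))
d_far = hamming_distance(code_base, to_hamming_code(far, projection))

bytes_sift = IN_DIM * 1        # one byte per dimension (assumed storage)
bytes_binary = OUT_BITS // 8   # packed bits
print("near:", d_near, "far:", d_far, "memory ratio:", bytes_sift // bytes_binary)
```

Under these assumptions, matching reduces to an XOR and a bit count per descriptor pair, and storage per descriptor shrinks by a factor of 16. A learned projection, as proposed in the paper, would replace the random matrix here.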

Acknowledgements

This work was financially supported by the European Master in Informatics program, RWTH Aachen University, the University of Trento, and the PhD program of the University of Delaware. The authors are grateful to Torsten Sattler and Leif Kobbelt of RWTH Aachen University for their great help in accomplishing this work.

Author information

Authors and Affiliations

Video/Image Modeling and Synthesis Lab, University of Delaware, Newark, DE, 19711, USA
Guoyu Lu & Chandra Kambhamettu
Department of Information Engineering and Computer Science, University of Trento, 38100, Trento, Italy
Nicu Sebe
Institute of Artificial Intelligence, Zhejiang University, Hangzhou, 310027, People’s Republic of China
Congfu Xu

Corresponding author

Correspondence to Guoyu Lu.

About this article

Cite this article

Lu, G., Sebe, N., Xu, C. et al. Memory efficient large-scale image-based localization. Multimed Tools Appl 74, 479–503 (2015). https://doi.org/10.1007/s11042-014-1977-3

Issue Date: January 2015
