Abstract
In this chapter, we explore the task of global image geolocalization, that is, estimating where on Earth a photograph was captured. We examine variants of the “im2gps” algorithm using millions of “geotagged” Internet photographs as training data. We first discuss a simple-to-understand nearest-neighbor baseline. Next, we introduce a lazy-learning approach with more sophisticated features that doubles the performance of the original “im2gps” algorithm. Beyond quantifying geolocalization accuracy, we also analyze (a) how the nonuniform distribution of training data impacts the algorithm, (b) how performance compares to baselines such as random guessing and land-cover recognition, and (c) whether geolocalization is simply landmark or “instance-level” recognition at a large scale. We also show that geolocation estimates can provide the basis for image understanding tasks such as population density estimation or land-cover estimation. This work was originally described, in part, in “im2gps” [9], which was the first attempt at global geolocalization using Internet-derived training data.
Notes
1. This value was calculated by counting the number of database photos close enough to each query in the test set. Alternatively, each geolocation guess has an area of 126,663 km\(^2\) and the land area of the Earth is 148,940,000 km\(^2\), suggesting that a truly uniform test set would have a chance guessing accuracy of 0.084%. Chance is higher for our test set because our database (and thus our test set) contains no photographs in some regions of Siberia, the Sahara, and Antarctica.
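The uniform-chance estimate in this note is a ratio of two areas. A minimal sketch of that arithmetic follows. The function name is hypothetical, and the 200 km radius is an assumption for illustration, since the note does not state the radius of a guess region. With the figures exactly as quoted, the ratio comes to roughly 0.085%. A planar disc of radius 200 km (about 125,664 km\(^2\)) gives roughly 0.084%.

```python
import math

# Figures quoted in the note.
GUESS_AREA_KM2 = 126_663            # area covered by one geolocation guess
EARTH_LAND_AREA_KM2 = 148_940_000   # land area of the Earth


def chance_accuracy_percent(guess_area_km2: float, land_area_km2: float) -> float:
    """Percent chance that a uniformly random land location lies inside one guess region."""
    return 100.0 * guess_area_km2 / land_area_km2


# Ratio computed directly from the quoted figures.
quoted_pct = chance_accuracy_percent(GUESS_AREA_KM2, EARTH_LAND_AREA_KM2)

# ASSUMPTION (not stated in the note): the guess region is a planar disc
# of radius 200 km. Earth curvature is ignored at this scale.
ASSUMED_RADIUS_KM = 200.0
disc_area_km2 = math.pi * ASSUMED_RADIUS_KM ** 2
disc_pct = chance_accuracy_percent(disc_area_km2, EARTH_LAND_AREA_KM2)

print(f"quoted area:  {GUESS_AREA_KM2:,} km^2 -> {quoted_pct:.3f} %")
print(f"assumed disc: {disc_area_km2:,.0f} km^2 -> {disc_pct:.3f} %")
```

Either way the uniform-chance rate is below 0.1%, which is the point of the comparison. The empirical chance rate for the actual test set is higher for the reason given in the note.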
References
G. Baatz, O. Saurer, K. Köser, M. Pollefeys, Large scale visual geo-localization of images in mountainous terrain, in Proceedings of the 12th European Conference on Computer Vision, Part II (2012), pp. 517–530
M. Bar, The proactive brain: using analogies and associations to generate predictions. Trends Cogn. Sci. 11(7), 280–289 (2007)
C.G. Atkeson, A.W. Moore, S. Schaal, Locally weighted learning. Artif. Intell. Rev. 11, 11–73 (1997)
O. Chum, J. Philbin, J. Sivic, M. Isard, A. Zisserman, Total recall: Automatic query expansion with a generative feature model for object retrieval, in Proceedings of ICCV (2007)
D. Comaniciu, P. Meer, Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 24(5), 603–619 (2002)
D.J. Crandall, L. Backstrom, D. Huttenlocher, J. Kleinberg, Mapping the world’s photos, in WWW ’09: Proceedings of the 18th International Conference on World Wide Web (2009), pp. 761–770
J. Hays, A. Efros, Where in the world? Human and computer geolocation of images, in Vision Sciences Society Meeting (2009)
J. Hays, A.A. Efros, Scene completion using millions of photographs. ACM Trans. Graph. (SIGGRAPH 2007) 26(3) (2007)
J. Hays, A.A. Efros, im2gps: estimating geographic information from a single image, in CVPR (2008)
D. Hoiem, A. Efros, M. Hebert, Recovering surface layout from an image. Int. J. Comput. Vis. 75(1), 151–172 (2007)
N. Jacobs, S. Satkin, N. Roman, R. Speyer, R. Pless, Geolocating static cameras, in Proceedings of ICCV (2007)
E. Kalogerakis, O. Vesselova, J. Hays, A.A. Efros, A. Hertzmann, Image sequence geolocation with human travel priors, in Proceedings of the IEEE International Conference on Computer Vision (ICCV ’09) (2009)
J. Kosecka, W. Zhang, Video compass, in ECCV ’02: Proceedings of the 7th European Conference on Computer Vision, Part IV (2002), pp. 476–490
J.-F. Lalonde, D. Hoiem, A.A. Efros, C. Rother, J. Winn, A. Criminisi, Photo clip art. ACM Trans. Graph. (SIGGRAPH 2007) 26(3) (2007)
S. Lazebnik, C. Schmid, J. Ponce, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories, in CVPR (2006)
L.-J. Li, L. Fei-Fei, What, where and who? Classifying events by scene and object recognition, in Proceedings of ICCV (2007)
T.-Y. Lin, S. Belongie, J. Hays, Cross-view image geolocalization, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (Portland, OR, June 2013)
D. Lowe, Object recognition from local scale-invariant features. ICCV 2, 1150–1157 (1999)
J. Luo, D. Joshi, J. Yu, A. Gallagher, Geotagging in multimedia and computer vision: a survey. Multimed. Tools Appl. 51, 187–211 (2011)
D. Martin, C. Fowlkes, D. Tal, J. Malik, A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics, in Proceedings of ICCV (July 2001)
J. Matas, O. Chum, M. Urban, T. Pajdla, Robust wide-baseline stereo from maximally stable extremal regions. Image Vis. Comput. 22(10), 761–767 (2004)
A. Oliva, A. Torralba, Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42(3), 145–175 (2001)
A. Oliva, A. Torralba, Building the gist of a scene: The role of global image features in recognition, in Visual Perception, Progress in Brain Research, vol. 155 (2006)
J. Philbin, O. Chum, M. Isard, J. Sivic, A. Zisserman, Object retrieval with large vocabularies and fast spatial matching, in CVPR (2007)
J. Philbin, O. Chum, M. Isard, J. Sivic, A. Zisserman, Lost in quantization: Improving particular object retrieval in large scale image databases, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2008)
T. Quack, B. Leibe, L. Van Gool, World-scale mining of objects and events from community photo collections, in CIVR ’08: Proceedings of the 2008 International Conference on Content-Based Image and Video Retrieval (2008)
L.W. Renninger, J. Malik, When is scene recognition just texture recognition? Vis. Res. 44, 2301–2311 (2004)
I. Simon, N. Snavely, S.M. Seitz, Scene summarization for online image collections, in Proceedings of ICCV (2007)
J. Sivic, A. Zisserman, Video Google: A text retrieval approach to object matching in videos. ICCV 2, 1470–1477 (2003)
N. Snavely, S.M. Seitz, R. Szeliski, Photo tourism: exploring photo collections in 3D. ACM Trans. Graph. 25(3), 835–846 (2006)
R. Szeliski, “Where am I?”: ICCV 2005 Computer Vision Contest. http://research.microsoft.com/iccv2005/Contest/
W. Thompson, C. Valiquette, B. Bennett, K. Sutherland, Geometric reasoning for map-based localization. Spat. Cogn. Comput. 1(3), 291–321 (1999)
A. Torralba, R. Fergus, W.T. Freeman, 80 million tiny images: a large dataset for non-parametric object and scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 30(11), 1958–1970 (2008)
J. Vogel, B. Schiele, Semantic modeling of natural scenes for content-based image retrieval. Int. J. Comput. Vis. 72(2), 133–157 (2007)
J. Xiao, J. Hays, K. Ehinger, A. Oliva, A. Torralba, SUN database: Large-scale scene recognition from abbey to zoo, in CVPR (2010)
H. Zhang, A.C. Berg, M. Maire, J. Malik, SVM-KNN: Discriminative nearest neighbor classification for visual category recognition, in CVPR ’06 (2006)
W. Zhang, J. Kosecka, Image based localization in urban environments, in 3DPVT ’06 (2006)
Y. Zheng, M. Zhao, Y. Song, H. Adam, U. Buddemeier, A. Bissacco, F. Brucher, T.-S. Chua, H. Neven, Tour the world: building a web-scale landmark recognition engine, in CVPR (2009)
Acknowledgments
We thank Steve Schlosser, Julio Lopez, and Intel Research Pittsburgh for helping us overcome the logistical and computational challenges of this project. All visualizations and geographic data sources are derived from NASA data. Funding for this work was provided by an NSF fellowship to James Hays and NSF grants CAREER 1149853, CAREER 0546547, and CCF-0541230.
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Hays, J., Efros, A.A. (2015). Large-Scale Image Geolocalization. In: Choi, J., Friedland, G. (eds) Multimodal Location Estimation of Videos and Images. Springer, Cham. https://doi.org/10.1007/978-3-319-09861-6_3
DOI: https://doi.org/10.1007/978-3-319-09861-6_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09860-9
Online ISBN: 978-3-319-09861-6
eBook Packages: Engineering, Engineering (R0)