Discovering Geo-Informative Attributes for Location Recognition and Exploration

Authors:

Changsheng Xu

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 11, Issue 1s

Article No.: 19, Pages 1 - 23

https://doi.org/10.1145/2648581

Published: 01 October 2014

Abstract

This article considers the problem of automatically discovering geo-informative attributes for location recognition and exploration. The attributes are expected to be both discriminative and representative: each should correspond to a distinctive visual pattern and carry a semantic interpretation. Our solution analyzes attributes at the region level. Each segmented region in the training set is assigned a binary latent variable indicating whether it is discriminative. A latent learning framework is proposed for discriminative region detection and geo-informative attribute discovery. Moreover, we use user-generated content to obtain semantic interpretations for the discovered visual attributes. Discriminative and search-based attribute annotation methods are developed for geo-informative attribute interpretation. The proposed approach is evaluated on a challenging dataset of Google Street View and Flickr photos. Experimental results show that (1) geo-informative attributes are discriminative and useful for location recognition, and (2) the discovered semantic interpretations are meaningful and can be exploited for further location exploration.
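
The idea of giving each region a binary latent variable can be illustrated with a toy sketch. This is not the article's formulation: it assumes a generic alternating scheme (fit a linear scorer with the flags fixed, then re-infer the flags with the scorer fixed), and every function name, feature vector, and hyperparameter below is hypothetical.

```python
import random


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def infer_latent(w, regions):
    """Return one 0/1 flag per region of a single image.

    A region is flagged when its score is positive; the best-scoring
    region is always flagged so the image keeps at least one region.
    """
    scores = [dot(w, x) for x in regions]
    best = max(range(len(regions)), key=scores.__getitem__)
    return [1 if (s > 0 or i == best) else 0 for i, s in enumerate(scores)]


def train(images, labels, dim, rounds=5, epochs=20, lr=0.1, reg=0.01, seed=0):
    """images: list of images, each a non-empty list of region feature vectors.
    labels: +1 (target location) or -1 (other location) per image.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    # Assumption: start by treating every region as discriminative.
    latent = [[1] * len(regs) for regs in images]
    for _ in range(rounds):
        # Step 1: flags fixed. Flagged regions of positive images are
        # positive samples; all regions of negative images are negative.
        samples = []
        for regs, y, flags in zip(images, labels, latent):
            for x, flag in zip(regs, flags):
                if y < 0:
                    samples.append((x, -1))
                elif flag:
                    samples.append((x, 1))
        # Hinge-loss subgradient steps with a small weight decay.
        for _ in range(epochs):
            rng.shuffle(samples)
            for x, y in samples:
                margin = y * dot(w, x)
                w = [wi * (1 - lr * reg) for wi in w]
                if margin < 1:
                    w = [wi + lr * y * xi for wi, xi in zip(w, x)]
        # Step 2: scorer fixed. Re-infer flags for positive images only.
        latent = [
            infer_latent(w, regs) if y > 0 else [0] * len(regs)
            for regs, y in zip(images, labels)
        ]
    return w, latent


# Hypothetical 3-D region features; the last component acts as a bias term.
images = [
    [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]],
    [[0.0, 1.0, 1.0], [1.0, 0.1, 1.0]],
    [[0.0, 1.0, 1.0], [0.0, 0.9, 1.0]],
    [[0.1, 1.0, 1.0]],
]
labels = [1, 1, -1, -1]
w, latent = train(images, labels, dim=3)
print(latent)
```

In this sketch the flags play the role of the per-region latent variables, and the regions that remain flagged in positive images are the candidates one might group into attributes afterwards. A real system would operate on segmented image regions with learned visual descriptors rather than hand-written vectors.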

Index Terms

Discovering Geo-Informative Attributes for Location Recognition and Exploration
1. Information systems
  1. Information retrieval
  2. Information systems applications

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 11, Issue 1s

Special Issue on Multiple Sensorial (MulSeMedia) Multimodal Media: Advances and Applications

September 2014

260 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/2675060

Editors:
Ralf Steinmetz (Technische Universität Darmstadt, Germany)
Gheorghita Ghinea
Christian Timmerer
Weisi Lin
Stephen Gulliver
Zheng-Jun Zha (Institute of Intelligent Machines, Chinese Academy of Sciences)
Lei Zhang (Bing Multimedia Search, Microsoft Corporation)
Max Mühlhäuser (Department of Computer Science, Technische Universität Darmstadt)
Alan F. Smeaton (Insight Centre for Data Analytics, Dublin City University)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 October 2014

Accepted: 01 June 2014

Revised: 01 June 2014

Received: 01 January 2014

Published in TOMM Volume 11, Issue 1s

Qualifiers

Research-article
Research
Refereed

Bibliometrics & Citations

Bibliometrics

Total Citations: 8
Total Downloads: 264
Downloads (Last 12 months): 4
Downloads (Last 6 weeks): 1

Reflects downloads up to 03 Mar 2025

Citations

Cited By

Zhang X, Wang S, Li Z, Ma S. (2018). Landmark Image Retrieval by Jointing Feature Refinement and Multimodal Classifier Learning. IEEE Transactions on Cybernetics 48:6 (1682-1695). Online publication date: Jun-2018. https://doi.org/10.1109/TCYB.2017.2712798
Zhang X, Zhao Z, Zhang H, Wang S, Li Z. (2018). Unsupervised geographically discriminative feature learning for landmark tagging. Knowledge-Based Systems 149:C (143-154). Online publication date: 1-Jun-2018. https://dl.acm.org/doi/10.1016/j.knosys.2018.03.005
Zang A, Li Z, Doria D, Trajcevski G, Gao S, He G. (2017). Accurate vehicle self-localization in high definition map dataset. Proceedings of the 1st ACM SIGSPATIAL Workshop on High-Precision Maps and Intelligent Applications for Autonomous Vehicles (1-8). Online publication date: 7-Nov-2017. https://dl.acm.org/doi/10.1145/3149092.3149094
Salem T, Workman S, Zhai M, Jacobs N. (2016). Analyzing human appearance as a cue for dating images. 2016 IEEE Winter Conference on Applications of Computer Vision (WACV) (1-8). Online publication date: Mar-2016. https://doi.org/10.1109/WACV.2016.7477678
Fang Q, Xu C, Sang J, Hossain M, Ghoneim A. (2016). Folksonomy-Based Visual Ontology Construction and Its Applications. IEEE Transactions on Multimedia 18:4 (702-713). Online publication date: 1-Apr-2016. https://dl.acm.org/doi/10.1109/TMM.2016.2527602
Workman S, Souvenir R, Jacobs N. (2015). Wide-Area Image Geolocalization with Aerial Reference Imagery. Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV) (3961-3969). Online publication date: 7-Dec-2015. https://dl.acm.org/doi/10.1109/ICCV.2015.451
Workman S, Jacobs N. (2015). On the location dependence of convolutional neural network features. 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (70-78). Online publication date: Jun-2015. https://doi.org/10.1109/CVPRW.2015.7301385
Ding D, Chu W. (2015). Weather-Adaptive Distance Metric for Landmark Image Classification. Proceedings, Part II, of the 16th Pacific-Rim Conference on Advances in Multimedia Information Processing -- PCM 2015 - Volume 9315 (139-148). Online publication date: 16-Sep-2015. https://dl.acm.org/doi/10.1007/978-3-319-24078-7_14
