Discovering Geo-Informative Attributes for Location Recognition and Exploration

Published: 01 October 2014

Abstract

This article considers the problem of automatically discovering geo-informative attributes for location recognition and exploration. The attributes are expected to be both discriminative and representative: they should correspond to distinctive visual patterns and be associated with semantic interpretations. In our solution, we analyze attributes at the region level. Each segmented region in the training set is assigned a binary latent variable indicating its discriminative capability. A latent learning framework is proposed for discriminative region detection and geo-informative attribute discovery. Moreover, we use user-generated content to obtain semantic interpretations for the discovered visual attributes. Discriminative and search-based attribute annotation methods are developed for geo-informative attribute interpretation. The proposed approach is evaluated on a challenging dataset that includes Google Street View and Flickr photos. Experimental results show that (1) geo-informative attributes are discriminative and useful for location recognition, and (2) the discovered semantic interpretations are meaningful and can be exploited for further location exploration.
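
As a rough illustration of the region-level idea (one binary latent variable per segmented region, learned jointly with a classifier), the following is a minimal sketch and not the article's actual model. It assumes a two-class setting (one target location versus all others), a ridge-regularized least-squares classifier standing in for the latent learning framework, and a simple "keep the top-scoring regions per image" rule for re-inferring the latent variables. All function names, parameters, and the toy data are hypothetical.

```python
import numpy as np


def fit_linear(X, y, reg=1e-2):
    """Ridge-regularized least-squares classifier (an assumed stand-in learner).

    Returns a weight vector whose last entry is the bias term.
    """
    Xb = np.hstack([X, np.ones((len(X), 1))])
    A = Xb.T @ Xb + reg * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def score(w, X):
    """Linear score of each region (row of X) under weights w."""
    return X @ w[:-1] + w[-1]


def latent_region_selection(pos_bags, neg_regions, keep=1, n_iters=5):
    """Alternate between fitting a classifier and re-inferring latent variables.

    pos_bags:    list of (n_regions_i, d) arrays, one per image of the target location.
    neg_regions: (m, d) array of regions taken from other locations.
    Returns (w, h): w is the classifier from the last fit, and h[i] is a 0/1
    vector marking which regions of image i were inferred as discriminative.
    """
    # Start with every region of every positive image switched on.
    h = [np.ones(len(bag), dtype=int) for bag in pos_bags]
    for _ in range(n_iters):
        # Step 1: latent variables fixed, fit the classifier on selected regions.
        pos = np.vstack([bag[hi == 1] for bag, hi in zip(pos_bags, h)])
        X = np.vstack([pos, neg_regions])
        y = np.concatenate([np.ones(len(pos)), -np.ones(len(neg_regions))])
        w = fit_linear(X, y)
        # Step 2: classifier fixed, keep the `keep` best-scoring regions per image.
        h = []
        for bag in pos_bags:
            hi = np.zeros(len(bag), dtype=int)
            hi[np.argsort(score(w, bag))[-keep:]] = 1
            h.append(hi)
    return w, h


def make_toy_data(n_images=20, seed=0):
    """Synthetic 2-D region features: region 0 is distinctive, the rest are clutter."""
    rng = np.random.default_rng(seed)
    pos_bags = []
    for _ in range(n_images):
        landmark = rng.normal(3.0, 0.3, size=(1, 2))    # distinctive visual pattern
        background = rng.normal(0.0, 0.3, size=(2, 2))  # generic street clutter
        pos_bags.append(np.vstack([landmark, background]))
    neg_regions = rng.normal(0.0, 0.3, size=(2 * n_images, 2))
    return pos_bags, neg_regions


if __name__ == "__main__":
    bags, negatives = make_toy_data()
    weights, selected = latent_region_selection(bags, negatives, keep=1)
    print("regions kept in the first image:", selected[0])
```

On this toy data the selection should settle on the distinctive region of each image, because the background regions look the same in positive and negative images and so carry no location information. A real system would use actual region descriptors, multiple location classes, and the learning objective described in the article itself.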

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 11, Issue 1s
Special Issue on Multiple Sensorial (MulSeMedia) Multimodal Media: Advances and Applications
September 2014
260 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/2675060

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

Published: 01 October 2014
Accepted: 01 June 2014
Revised: 01 June 2014
Received: 01 January 2014
Published in TOMM Volume 11, Issue 1s

Author Tags

1. Geo-informative attributes
2. latent model
3. location recognition
