DOI:10.1145/2393347.2393357
research-article

Finding perfect rendezvous on the go: accurate mobile visual localization and its applications to routing

Published: 29 October 2012

Abstract

While on the go, more and more people use their phones to enjoy ubiquitous location-based services (LBS). One of the fundamental problems of LBS is localization. Researchers are now investigating ways to use a phone-captured image for localization, as it contains more scene context information than the embedded sensors provide. In this paper, we present a novel approach to mobile visual localization that accurately senses geographic scene context from the current image (typically associated with a rough GPS position). Unlike most existing visual localization methods, the proposed approach provides a more complete and accurate set of geographic parameters for the scene, including the actual locations of both the mobile user and, perhaps more importantly, the captured scene, along with the viewing direction. Our approach takes advantage of advanced techniques for large-scale image retrieval and 3D model reconstruction from photos. Specifically, we first perform joint geo-visual clustering in the cloud to generate scene clusters, with each scene represented by a 3D model. The 3D scene models are then indexed using a visual vocabulary tree structure. The phone-captured image is used to retrieve the relevant scene models, is then aligned with those models, and is finally registered to the real-world map. Our approach achieves an estimation accuracy of user location within 14 meters, viewing direction within 9 degrees, and scene location within 21 meters. Such a complete set of accurate geo-parameters enables various LBS routing applications that cannot be achieved with most existing methods. In particular, we showcase three novel applications: 1) accurate self-localization, 2) collaborative localization for rendezvous routing, and 3) routing for photographing. Evaluations through user studies indicate that these applications are effective in facilitating the perfect rendezvous for mobile users.
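To make the reported accuracy figures concrete, the following is a minimal sketch of how location error (in meters) and viewing-direction error (in degrees) could be measured against ground truth. It is not the authors' evaluation code. The function names, the spherical-Earth approximation, and the use of the haversine formula are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; a spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing (0 = north, clockwise) from point 1 toward point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def angular_error_deg(a, b):
    """Smallest absolute difference between two compass angles, in [0, 180]."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)
```

Under these assumptions, `haversine_m` applied to an estimated and a true user (or scene) position gives the location error, `bearing_deg` from the user position to the scene position gives a viewing direction, and `angular_error_deg` compares that direction with a reference while handling the wrap-around at 360 degrees.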

Supplementary Material

JPG File (d15.jpg)
MP4 File (d15.mp4)


Information & Contributors

Information

Published In

MM '12: Proceedings of the 20th ACM international conference on Multimedia
October 2012
1584 pages
ISBN:9781450310895
DOI:10.1145/2393347

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. geo-tagging
  2. location-based services
  3. mobile visual localization
  4. scene reconstruction

Qualifiers

  • Research-article

Conference

MM '12: ACM Multimedia Conference
October 29 - November 2, 2012
Nara, Japan

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 4
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 28 Feb 2025

Citations

Cited By

  • (2022) Geometry-Constrained Scale Estimation for Monocular Visual Odometry. IEEE Transactions on Multimedia 24 (3144-3156). DOI: 10.1109/TMM.2021.3093771. Online publication date: 1-Jan-2022
  • (2019) On-Device Scalable Image-Based Localization via Prioritized Cascade Search and Fast One-Many RANSAC. IEEE Transactions on Image Processing 28:4 (1675-1690). DOI: 10.1109/TIP.2018.2881829. Online publication date: Apr-2019
  • (2018) Multi-source fusion based geo-tagging for web images. Multimedia Tools and Applications 77:13 (16399-16417). DOI: 10.5555/3267983.3268060. Online publication date: 1-Jul-2018
  • (2018) 3D Image-based Indoor Localization Joint With WiFi Positioning. Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval (465-472). DOI: 10.1145/3206025.3206070. Online publication date: 5-Jun-2018
  • (2018) POI Summarization by Aesthetics Evaluation From Crowd Source Social Media. IEEE Transactions on Image Processing 27:3 (1178-1189). DOI: 10.1109/TIP.2017.2769454. Online publication date: Mar-2018
  • (2018) Landmark Image Retrieval by Jointing Feature Refinement and Multimodal Classifier Learning. IEEE Transactions on Cybernetics 48:6 (1682-1695). DOI: 10.1109/TCYB.2017.2712798. Online publication date: Jun-2018
  • (2017) LiveMaps. Proceedings of the 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (1-9). DOI: 10.1145/3139958.3139965. Online publication date: 7-Nov-2017
  • (2017) LiveMaps. Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (897-900). DOI: 10.1145/3077136.3080673. Online publication date: 7-Aug-2017
  • (2017) Exploiting Social-Mobile Information for Location Visualization. ACM Transactions on Intelligent Systems and Technology 8:3 (1-19). DOI: 10.1145/3001594. Online publication date: 12-Jan-2017
  • (2017) Image Location Inference by Multisaliency Enhancement. IEEE Transactions on Multimedia 19:4 (813-821). DOI: 10.1109/TMM.2016.2638207. Online publication date: 1-Apr-2017
