ABSTRACT
While on the go, more and more people use their phones to access ubiquitous location-based services (LBS). One of the fundamental problems of LBS is localization. Researchers are now investigating ways to use a phone-captured image for localization, because it carries more scene context information than the embedded sensors provide. In this paper, we present a novel approach to mobile visual localization that accurately senses geographic scene context from the current image (typically associated with a rough GPS position). Unlike most existing visual localization methods, the proposed approach provides a complete and more accurate set of parameters about the scene geo-context, including the actual locations of both the mobile user and, perhaps more importantly, the captured scene, along with the viewing direction. Our approach takes advantage of advanced techniques for large-scale image retrieval and 3D model reconstruction from photos. Specifically, we first perform joint geo-visual clustering in the cloud to generate scene clusters, with each scene represented by a 3D model. The 3D scene models are then indexed using a visual vocabulary tree structure. The phone-captured image is used to retrieve the relevant scene models, is then aligned with those models, and is finally registered to the real-world map. Our approach estimates the user location to within 14 meters, the viewing direction to within 9 degrees, and the scene location to within 21 meters. Such a complete set of accurate geo-parameters enables various LBS routing applications that cannot be achieved with most existing methods. In particular, we showcase three novel applications: 1) accurate self-localization, 2) collaborative localization for rendezvous routing, and 3) routing for photographing. Evaluations through user studies indicate that these applications are effective in facilitating the perfect rendezvous for mobile users.
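The accuracy figures above are stated in meters and degrees. As a minimal sketch of how such errors could be measured, assuming each estimate and its ground truth are given as latitude/longitude pairs and compass headings in degrees, the following computes a great-circle position error and a wrap-aware heading error. The function names and the Earth-radius constant are illustrative assumptions; the abstract does not say how the authors computed their errors.

```python
import math

# Mean Earth radius in meters (an assumed constant for this sketch).
EARTH_RADIUS_M = 6371000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def heading_error_deg(estimated, truth):
    """Smallest absolute angle in degrees between two compass headings."""
    # Shift into [-180, 180) so that 355 vs. 5 degrees counts as 10, not 350.
    diff = (estimated - truth + 180.0) % 360.0 - 180.0
    return abs(diff)


if __name__ == "__main__":
    # Hypothetical estimate vs. ground truth, for illustration only.
    print(haversine_m(40.0000, -74.0000, 40.0001, -74.0000))
    print(heading_error_deg(355.0, 5.0))
```

The same two functions apply to both the user location and the scene location, since each is a single geographic point; only the viewing direction needs the angular wrap-around.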
Index Terms
- Finding perfect rendezvous on the go: accurate mobile visual localization and its applications to routing