Active query sensing: Suggesting the best query view for mobile visual search

Authors:

Shih-Fu Chang

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 8, Issue 3s

Article No.: 40, Pages 1 - 21

https://doi.org/10.1145/2348816.2348819

Published: 16 October 2012

Abstract

While much exciting progress is being made in mobile visual search, one important question has been left unexplored in all current systems: when searching for objects or scenes in the 3D world, which viewing angle is more likely to be successful? In particular, if the first query fails to find the right target, how should the user control the mobile camera to form the second query? In this article, we propose a novel Active Query Sensing system for mobile location search, which actively suggests the best subsequent query view for recognizing the physical location in the mobile environment. The proposed system includes two unique components: (1) an offline process that analyzes the saliencies of the different views associated with each geographical location and predicts the location search precision of each view by modeling its self-retrieval score distribution; and (2) an online process that estimates the view of an unseen query and suggests the best subsequent view change. Specifically, the optimal viewing angle change for the next query is formulated online in information-theoretic terms. Using a scalable visual search system implemented over a NYC street view dataset (0.3 million images), we show a performance gain: the failure rate of mobile location search drops to only 12% after the second query. We have also implemented an end-to-end functional system, including user interfaces on iPhones, client-server communication, and a remote search server. This work may open up an exciting new direction for developing interactive mobile media applications through the innovative exploitation of active sensing and query formulation.
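
The abstract does not give the formulation itself, so the following is only a minimal sketch of one way an information-theoretic view suggestion could work, not the authors' method. It assumes a belief over candidate locations left by the first query and a hypothetical per-view table of success probabilities (standing in for the offline view saliencies), and it picks the view with the lowest expected remaining uncertainty. The names `entropy`, `expected_posterior_entropy`, and `best_next_view`, and the toy outcome model, are all illustrative assumptions.

```python
import math


def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def expected_posterior_entropy(prior, success_prob):
    """Expected entropy over locations after one query from a given view.

    prior[i]        : current belief that the user is at location i
    success_prob[i] : assumed probability that this view retrieves
                      location i correctly when the user is there
    Toy outcome model (an assumption): the search either succeeds, in
    which case the belief collapses and entropy is 0, or it fails, in
    which case the belief is re-weighted by 1 - success_prob[i].
    """
    p_fail = sum(p * (1 - s) for p, s in zip(prior, success_prob))
    if p_fail == 0:
        return 0.0
    post_fail = [p * (1 - s) / p_fail for p, s in zip(prior, success_prob)]
    return p_fail * entropy(post_fail)


def best_next_view(prior, saliency):
    """Return the view (dict key) with the lowest expected posterior entropy.

    saliency maps a view label, e.g. an angle in degrees, to a list of
    per-location success probabilities aligned with prior.
    """
    return min(
        saliency,
        key=lambda view: expected_posterior_entropy(prior, saliency[view]),
    )
```

Under this sketch, a caller would pass the belief remaining after a failed first query together with the saliency table, and the returned key would be the view to suggest to the user. A real system would need a richer outcome model than the binary succeed-or-fail assumption used here.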

Cited By

Çalışır F., Baştan M., Ulusoy Ö., Güdükbay U. (2017). Mobile multi-view object image search. Multimedia Tools and Applications 76:10 (12433-12456). doi:10.1007/s11042-016-3659-9. Online publication date: 1-May-2017
https://dl.acm.org/doi/10.1007/s11042-016-3659-9
You Q., Yuan J., Wang J., Guo P., Luo J. (2015). Snap n' shop: Visual search-based mobile shopping made a breeze by machine and crowd intelligence. Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015) (173-180). doi:10.1109/ICOSC.2015.7050803. Online publication date: Feb-2015
https://doi.org/10.1109/ICOSC.2015.7050803

Index Terms

Active query sensing: Suggesting the best query view for mobile visual search
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks
        Scene understanding
2. Information systems
  1. Information retrieval
  2. Information systems applications

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 8, Issue 3s

Special section of best papers of ACM multimedia 2011, and special section on 3D mobile multimedia

September 2012

173 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/2348816

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 16 October 2012

Accepted: 01 June 2012

Revised: 01 May 2012

Received: 01 March 2012

Published in TOMM Volume 8, Issue 3s

Qualifiers

Research-article
Research
Refereed

Funding Sources

Division of Computer and Network Systems

Bibliometrics

Total Citations: 2
Total Downloads: 347
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 01 Mar 2025
