A unified context model for web image retrieval

Published: 06 August 2012

Abstract

Content-based web image retrieval using the query-by-example (QBE) principle remains challenging because of the semantic gap and the gap between a user's intent and what a typical image query can represent. In this article, we address this problem by integrating query-related contextual information into an advanced query model to improve the performance of QBE-based web image retrieval. We consider both the local and the global context of the query image. The local context is inferred from the web pages and the click-through log associated with the query image, while the global context is derived from the entire corpus, comprising all web images and their associated web pages. To incorporate the local query context effectively, we propose a language-modeling-based approach that handles the combined structured query representation built from the contextual and visual information. The global query context is integrated through a multi-modal relevance model, which is used to “reconstruct” the query from the document models indexed in the corpus. In this way, the global query context compensates for noise or missing information in the query and its local context, yielding a comprehensive and robust query model. We evaluated the proposed approach on a representative product image dataset collected from the web and demonstrated that including the local and global query contexts significantly improves the performance of QBE-based web image retrieval.
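
As a rough illustration of scoring documents against a structured query with a language model, the sketch below ranks a toy corpus by a weighted sum of per-field query log-likelihoods. It is not the model proposed in the article: the field names (`text`, `visual`), the field weights, the Dirichlet smoothing parameter `mu`, the add-one corpus estimate, and the toy data are all assumptions made for this example, and the global-context relevance model is not covered.

```python
import math
from collections import Counter


def field_log_likelihood(query_terms, doc_terms, corpus_counts, corpus_len, mu=100.0):
    """Dirichlet-smoothed log P(query field | document field)."""
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    vocab_size = len(corpus_counts)
    score = 0.0
    for term in query_terms:
        # Add-one corpus estimate (an assumption) keeps unseen terms non-zero.
        p_corpus = (corpus_counts.get(term, 0) + 1) / (corpus_len + vocab_size + 1)
        p_term = (doc_counts[term] + mu * p_corpus) / (doc_len + mu)
        score += math.log(p_term)
    return score


def score(query, doc, corpus_stats, weights):
    """Weighted sum of per-field log-likelihoods for one document."""
    total = 0.0
    for field, weight in weights.items():
        counts, length = corpus_stats[field]
        total += weight * field_log_likelihood(query[field], doc[field], counts, length)
    return total


def rank(query, docs, weights):
    """Return document names sorted from best to worst match."""
    corpus_stats = {}
    for field in weights:
        counts = Counter(term for doc in docs.values() for term in doc[field])
        corpus_stats[field] = (counts, sum(counts.values()))
    return sorted(
        docs,
        key=lambda name: score(query, docs[name], corpus_stats, weights),
        reverse=True,
    )


# Hypothetical toy data: "text" holds words from the surrounding page,
# "visual" holds quantized visual-word identifiers.
docs = {
    "shoe_a": {"text": ["red", "running", "shoe"], "visual": ["v1", "v2", "v3"]},
    "bag_b": {"text": ["leather", "bag"], "visual": ["v7", "v8"]},
}
query = {"text": ["running", "shoe"], "visual": ["v1", "v2"]}
weights = {"text": 0.5, "visual": 0.5}

ranking = rank(query, docs, weights)
print(ranking)
```

Treating visual words as just another field lets the same smoothing and scoring code serve both modalities; in practice the field weights would have to be tuned on held-out data.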

    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 8, Issue 3
    July 2012
    143 pages
    ISSN:1551-6857
    EISSN:1551-6865
    DOI:10.1145/2240136

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 06 August 2012
    Accepted: 01 May 2011
    Revised: 01 November 2010
    Received: 01 June 2010
    Published in TOMM Volume 8, Issue 3

    Author Tags

    1. Image retrieval
    2. context-based web image retrieval

    Qualifiers

    • Research-article
    • Research
    • Refereed
