research-article

A unified context model for web image retrieval

Authors:

Xian-Sheng Hua

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 8, Issue 3

Article No.: 28, Pages 1 - 19

https://doi.org/10.1145/2240136.2240141

Published: 06 August 2012

Abstract

Content-based web image retrieval based on the query-by-example (QBE) principle remains a challenging problem, owing both to the semantic gap and to the gap between a user's intent and the representativeness of a typical image query. In this article, we address this problem by integrating query-related contextual information into an advanced query model to improve the performance of QBE-based web image retrieval. We consider both the local and the global context of the query image. The local context can be inferred from the web pages and the click-through log associated with the query image, while the global context is derived from the entire corpus comprising all web images and their associated web pages. To incorporate the local query context effectively, we propose a language-modeling-based approach that handles the structured query representation combining contextual and visual information. The global query context is integrated through a multi-modal relevance model that “reconstructs” the query from the document models indexed in the corpus. In this way, the global query context compensates for noise or missing information in the query and its local context, yielding a comprehensive and robust query model. We evaluated the proposed approach on a representative product image dataset collected from the web and demonstrated that including the local and global query contexts significantly improves the performance of QBE-based web image retrieval.
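
The two ideas in the abstract can be illustrated with a toy sketch. It is not the article's model: the abstract gives no formulas, so the code below assumes a plain unigram language model over a shared vocabulary of text words and visual-word identifiers, Jelinek-Mercer smoothing, a uniform document prior, and hypothetical mixing weights. The local context is folded in by interpolating per-field models, and the global context is approximated with a standard relevance-model estimate that weights each corpus document by its query likelihood.

```python
from collections import Counter


def unigram(tokens):
    """Maximum-likelihood unigram model P(w | tokens)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def mix(models, weights):
    """Linear interpolation of unigram models; weights should sum to 1."""
    out = Counter()
    for model, weight in zip(models, weights):
        for w, p in model.items():
            out[w] += weight * p
    return dict(out)


def query_likelihood(query_tokens, doc_model, background, lam=0.5):
    """P(Q | D) under Jelinek-Mercer smoothing (an assumed choice)."""
    score = 1.0
    for w in query_tokens:
        score *= (1 - lam) * doc_model.get(w, 0.0) + lam * background.get(w, 0.0)
    return score


def relevance_model(query_tokens, doc_models, background, lam=0.5):
    """P(w | R) proportional to sum over D of P(w | D) * P(Q | D), uniform prior."""
    likelihoods = [query_likelihood(query_tokens, d, background, lam)
                   for d in doc_models]
    z = sum(likelihoods)
    if z == 0:
        return {}
    rm = Counter()
    for doc_model, like in zip(doc_models, likelihoods):
        for w, p in doc_model.items():
            rm[w] += p * like / z
    return dict(rm)


# Hypothetical toy corpus: each document mixes page text with visual words ("v*").
docs = [
    ["camera", "lens", "v1", "v2"],
    ["shoe", "v8", "v9"],
]
doc_models = [unigram(d) for d in docs]
background = unigram([w for d in docs for w in d])

# Query image: its visual words plus text from its hosting page (local context).
visual_words = ["v1", "v2"]
page_text = ["camera"]

# Local context: one model per field, interpolated with assumed field weights.
local = mix([unigram(visual_words), unigram(page_text)], [0.5, 0.5])

# Global context: re-estimate the query from corpus documents that explain it well.
global_rm = relevance_model(visual_words + page_text, doc_models, background)

# Final query model: assumed 0.6 / 0.4 interpolation of local and global parts.
final = mix([local, global_rm], [0.6, 0.4])

for word, prob in sorted(final.items(), key=lambda kv: -kv[1]):
    print(f"{word}\t{prob:.4f}")
```

In this toy run the word "lens" never appears in the query or its local context, yet it receives probability mass in the final model because the corpus document that best explains the query contains it. That is the sense in which a global context can fill in information missing from the query.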

Index Terms

1. Information systems
  1. Information retrieval
    1. Document representation

Information

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 8, Issue 3

July 2012

143 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/2240136

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 August 2012

Accepted: 01 May 2011

Revised: 01 November 2010

Received: 01 June 2010

Published in TOMM Volume 8, Issue 3

Qualifiers

Research-article
Research
Refereed

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 7
Total Downloads: 466
Downloads (Last 12 months): 5
Downloads (Last 6 weeks): 0

Reflects downloads up to 01 Mar 2025

Citations

Cited By

Wang J, Lin Z, Huang D, Xiong L, Jin L (2024) LiLTv2: Language-substitutable Layout-image Transformer for Visual Information Extraction. ACM Transactions on Multimedia Computing, Communications, and Applications 21:3 (1-27). DOI: 10.1145/3708351. Online publication date: 11-Dec-2024
https://dl.acm.org/doi/10.1145/3708351
Liu S, Huang Z (2019) Efficient Image Hashing with Geometric Invariant Vector Distance for Copy Detection. ACM Transactions on Multimedia Computing, Communications, and Applications 15:4 (1-22). DOI: 10.1145/3355394. Online publication date: 16-Dec-2019
https://dl.acm.org/doi/10.1145/3355394
Yuan B, Gao X, Niu Z, Tian Q (2019) Discovering Latent Topics by Gaussian Latent Dirichlet Allocation and Spectral Clustering. ACM Transactions on Multimedia Computing, Communications, and Applications 15:1 (1-18). DOI: 10.1145/3290047. Online publication date: 23-Jan-2019
https://dl.acm.org/doi/10.1145/3290047
Zinemanas P, Fuentes M, Cancela P, Apolinário J (2018) An ENF-Based Audio Authenticity Method Robust to MP3 Compression. Circuits, Systems, and Signal Processing 37:11 (4973-4992). DOI: 10.5555/3288801.3288824. Online publication date: 1-Nov-2018
https://dl.acm.org/doi/10.5555/3288801.3288824
Kofler C, Larson M, Hanjalic A (2016) User Intent in Multimedia Search. ACM Computing Surveys 49:2 (1-37). DOI: 10.1145/2954930. Online publication date: 13-Aug-2016
https://dl.acm.org/doi/10.1145/2954930
Papapanagiotou V, Diou C, Delopoulos A (2015) Improving Concept-Based Image Retrieval with Training Weights Computed from Tags. ACM Transactions on Multimedia Computing, Communications, and Applications 12:2 (1-22). DOI: 10.1145/2790230. Online publication date: 2-Nov-2015
https://dl.acm.org/doi/10.1145/2790230
Zhao X, Li X, Zhang Z (2015) Joint Structural Learning to Rank with Deep Linear Feature Learning. IEEE Transactions on Knowledge and Data Engineering 27:10 (2756-2769). DOI: 10.1109/TKDE.2015.2426707. Online publication date: 1-Oct-2015
https://dl.acm.org/doi/10.1109/TKDE.2015.2426707
