Query Expansion for Content-Based Similarity Search Using Local and Global Features

Authors:
Michael E. Houle, National Institute of Informatics, Tokyo, Japan
Xiguo Ma, Google, Mountain View, USA
Vincent Oria, New Jersey Institute of Technology, Newark, NJ, USA
Jichao Sun, New Jersey Institute of Technology, Newark, NJ, USA

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 13, Issue 3, Article No. 25, pp. 1–23. https://doi.org/10.1145/3063595

Published: 31 May 2017

Abstract

This article presents an efficient and totally unsupervised content-based similarity search method for multimedia data objects represented by high-dimensional feature vectors. The assumption is that the similarity measure is applicable to feature vectors of arbitrary length. During the offline process, different sets of features are selected by a generalized version of the Laplacian Score in an unsupervised way for individual data objects in the database. Online retrieval is performed by ranking the query object in the feature spaces of candidate objects. Those candidates for which the query object is ranked highly are selected as the query results. The ranking scheme is incorporated into an automated query expansion framework to further improve the semantic quality of the search result. Extensive experiments were conducted on several datasets to show the capability of the proposed method in boosting effectiveness without losing efficiency.
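
The abstract names two building blocks: feature scoring with the Laplacian Score, and ranking the query object in the feature space of each candidate. The sketch below illustrates both under stated assumptions; it is not the authors' implementation. `laplacian_scores` computes the standard Laplacian Score of He, Cai, and Niyogi (2005) over the whole dataset, not the generalized per-object version the article proposes. The function names, the parameters `k` and `t`, the heat-kernel weights, the symmetrized k-NN graph, and the Euclidean distance are all illustrative choices. `query_rank` is one plausible reading of "ranking the query object in the feature space of a candidate", with the candidate's feature subset `feats` supplied by the caller.

```python
import numpy as np


def laplacian_scores(X, k=5, t=1.0):
    """Standard Laplacian Score per column of X (lower means the feature
    varies less between neighboring points). Assumes k < number of rows."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances, clipped against round-off.
    sq = np.sum(X * X, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    # Indices of the k nearest neighbors of every point.
    nn = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nn.ravel()
    # Heat-kernel weights on k-NN edges, symmetrized (assumed graph design).
    S = np.zeros((n, n))
    S[rows, cols] = np.exp(-d2[rows, cols] / t)
    S = np.maximum(S, S.T)
    deg = S.sum(axis=1)
    L = np.diag(deg) - S  # graph Laplacian
    # Remove the degree-weighted mean from each feature.
    Xc = X - (deg @ X) / deg.sum()
    num = np.sum(Xc * (L @ Xc), axis=0)              # f^T L f per feature
    den = np.sum(Xc * (deg[:, None] * Xc), axis=0)   # f^T D f per feature
    return num / np.maximum(den, 1e-12)


def query_rank(query, cand, X, feats):
    """Rank of `query` among the database rows of X, ordered by distance to
    row `cand`, using only the feature indices in `feats` (hypothetical
    per-candidate feature subset). Rank 1 means the query is closest."""
    ref = X[cand, feats]
    d_query = np.linalg.norm(query[feats] - ref)
    others = np.delete(np.arange(X.shape[0]), cand)
    d_db = np.linalg.norm(X[others][:, feats] - ref, axis=1)
    return 1 + int(np.sum(d_db < d_query))
```

In this reading, a candidate would be returned when `query_rank` is small, that is, when the query looks like a near neighbor from the candidate's own point of view. How the article selects per-object feature sets and thresholds the rank is not specified in the abstract and is left open here.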

Index Terms

1. Information systems
  1. Information systems applications
    1. Multimedia information systems
      1. Multimedia databases

Published in
ACM Transactions on Multimedia Computing, Communications, and Applications Volume 13, Issue 3
August 2017
233 pages
ISSN:1551-6857
EISSN:1551-6865
DOI:10.1145/3104033
Editor:
Alberto Del Bimbo
University of Firenze, Italy
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 31 May 2017
- Accepted: 1 February 2017
- Revised: 1 January 2017
- Received: 1 August 2016

Author Tags
Content-based similarity search
flexible aggregation
query expansion
subjective feature space
unsupervised feature selection
Qualifiers
- research-article
- Research
- Refereed