Abstract
This article presents an efficient, fully unsupervised content-based similarity search method for multimedia data objects represented by high-dimensional feature vectors. The method assumes that the similarity measure can be applied to feature vectors of arbitrary length. In an offline phase, a generalized version of the Laplacian Score selects, without supervision, a separate set of features for each data object in the database. Online retrieval then ranks the query object within the feature space of each candidate object, and the candidates that rank the query object highly are returned as the query result. This ranking scheme is incorporated into an automated query expansion framework to further improve the semantic quality of the search result. Extensive experiments on several datasets show that the proposed method improves effectiveness without sacrificing efficiency.
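The online step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the per-object feature subsets are already available (the abstract obtains them offline with a generalized Laplacian Score, which is not reproduced here), uses Euclidean distance as a stand-in similarity measure, and scans the database by brute force. All names (`subspace_distance`, `query_rank`, `reverse_rank_search`, `selected`, `max_rank`) are hypothetical.

```python
import math


def subspace_distance(a, b, features):
    """Euclidean distance restricted to the given feature indices (assumed measure)."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in features))


def query_rank(query, candidate_id, database, selected):
    """Rank of the query relative to a candidate, measured in the candidate's
    own feature subspace. Rank 1 means no database object is strictly closer
    to the candidate than the query is."""
    features = selected[candidate_id]
    candidate = database[candidate_id]
    d_query = subspace_distance(candidate, query, features)
    closer = sum(
        1
        for other_id, other in database.items()
        if other_id != candidate_id
        and subspace_distance(candidate, other, features) < d_query
    )
    return closer + 1


def reverse_rank_search(query, database, selected, max_rank):
    """Return the ids of candidates that rank the query within max_rank,
    ordered by rank (best first), ties broken by id."""
    hits = []
    for candidate_id in database:
        rank = query_rank(query, candidate_id, database, selected)
        if rank <= max_rank:
            hits.append((rank, candidate_id))
    hits.sort()
    return [candidate_id for _, candidate_id in hits]


# Toy data: four objects with three features each (illustrative values only).
database = {
    "a": [0.0, 0.0, 5.0],
    "b": [0.1, 0.0, -5.0],
    "c": [4.0, 4.0, 0.0],
    "d": [4.1, 4.0, 0.5],
}
# Hypothetical per-object feature subsets, standing in for the offline selection.
selected = {"a": [0, 1], "b": [0, 1], "c": [0, 1, 2], "d": [0, 1]}
query = [0.05, 0.0, 0.0]

top_matches = reverse_rank_search(query, database, selected, max_rank=1)
```

In this toy example, objects `a` and `b` ignore the third feature, so the query ranks first for both even though it differs from them strongly in that dimension. The brute-force scan costs one pass over the database per candidate; the sketch makes no attempt to reproduce the efficiency the abstract claims.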
Index Terms
- Query Expansion for Content-Based Similarity Search Using Local and Global Features