Abstract
In many similarity-search applications, a user may lack an exact specification of their information need, or may be unable to formulate a query that exactly captures their notion of similarity. A promising way to mitigate this problem is to let the user submit a rough approximation of the desired query and then use relevance feedback on the retrieved objects to refine it. In this paper, we explore such a refinement strategy for a general class of structured similarity queries. Our approach casts refinement as a concept-learning problem in which the tuples on which the user provides feedback serve as a labeled training set. Under this setup, similarity query refinement consists of two learning tasks: learning the structure of the query and learning the relative importance of query components. The paper develops machine learning approaches suited to each task. Its primary contribution is the Refinement Activation Framework (RAF), which decides when each learner is invoked. Experimental analysis over many real-life datasets shows that our strategy significantly outperforms existing approaches in terms of retrieval quality.
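To make the second learning task concrete (learning the relative importance of query components from labeled feedback), the following is a minimal sketch only. It is not the learner developed in the paper: it assumes a query is a weighted sum of per-component similarity scores, and it uses a simple mean-gap heuristic to reweight components. The names `rank_score`, `refine_weights`, and the component labels `color` and `texture` are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Tuple

# Assumption: each tuple is summarized by per-component similarity scores in [0, 1].
Scores = Dict[str, float]
# Assumption: feedback is a list of (scores, is_relevant) pairs labeled by the user.
Feedback = List[Tuple[Scores, bool]]


def rank_score(weights: Dict[str, float], scores: Scores) -> float:
    """Weighted-sum similarity of one tuple under the current query weights."""
    return sum(weights[c] * scores[c] for c in weights)


def refine_weights(weights: Dict[str, float], feedback: Feedback) -> Dict[str, float]:
    """Reweight components by how well each separates relevant from non-relevant tuples.

    Illustrative heuristic (not the paper's method): a component's new weight is
    proportional to the gap between its mean score on relevant tuples and its
    mean score on non-relevant tuples, clipped at zero and normalized to sum to 1.
    """
    relevant = [s for s, label in feedback if label]
    non_relevant = [s for s, label in feedback if not label]
    if not relevant or not non_relevant:
        return dict(weights)  # not enough feedback to compare the two classes

    gaps = {}
    for c in weights:
        mean_rel = sum(s[c] for s in relevant) / len(relevant)
        mean_non = sum(s[c] for s in non_relevant) / len(non_relevant)
        gaps[c] = max(mean_rel - mean_non, 0.0)

    total = sum(gaps.values())
    if total == 0.0:
        return dict(weights)  # no component discriminates; keep the old weights
    return {c: g / total for c, g in gaps.items()}


if __name__ == "__main__":
    initial = {"color": 0.5, "texture": 0.5}
    feedback = [
        ({"color": 0.9, "texture": 0.5}, True),
        ({"color": 0.8, "texture": 0.5}, True),
        ({"color": 0.2, "texture": 0.5}, False),
    ]
    print(refine_weights(initial, feedback))
```

In this toy feedback set only the hypothetical `color` component separates relevant from non-relevant tuples, so the heuristic shifts all weight onto it. Learning the query structure, and deciding when to invoke each learner as RAF does, are outside the scope of this sketch.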
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ma, Y., Mehrotra, S., Seid, D.Y., Zhong, Q. (2006). RAF: An Activation Framework for Refining Similarity Queries Using Learning Techniques. In: Li Lee, M., Tan, KL., Wuwongse, V. (eds) Database Systems for Advanced Applications. DASFAA 2006. Lecture Notes in Computer Science, vol 3882. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11733836_41
DOI: https://doi.org/10.1007/11733836_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33337-1
Online ISBN: 978-3-540-33338-8
eBook Packages: Computer Science, Computer Science (R0)