Abstract
Internet users may encounter the empty-answer or too-few-answers problem when they pose a strict query to a Web database. To address this problem, we develop a general framework that enables automatic query relaxation and top-k result ranking. Our framework consists of two processing steps. The first step is query relaxation. Based on the user's original query, we estimate how much the user cares about each specified attribute by measuring the distribution of its specified value in the database. A specified value that is rare in the database indicates that the attribute may be important to the user. According to attribute importance, the original query is then rewritten as a relaxed query by expanding the range of each query criterion. The degree of relaxation on each specified attribute varies adaptively with the attribute weight. The most important attribute is relaxed to the smallest degree, so that the answers returned by the relaxed query are as relevant as possible to the user's original intention. The second step is top-k result ranking. In this step, we first generate contextual user preferences from the query history and then use them to create a priori orders of tuples during off-line pre-processing. Only a few representative orders are saved, each corresponding to a set of contexts. These orders and their associated contexts are then used at query time to quickly provide the top-k relevant answers by means of a top-k evaluation algorithm. Results of a preliminary user study demonstrate that our query relaxation and top-k result ranking methods capture users' preferences effectively. The efficiency and effectiveness of our approach are also demonstrated.
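The relaxation step can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so everything below is an assumption made for illustration: the IDF-style weight log((N + 1) / (count + 1)), the rule that the relaxation width is proportional to (1 - weight) times the observed spread of the attribute, the parameter `max_degree`, and the function names `attribute_weights`, `relax` and `run`. Only numeric attributes are handled.

```python
import math


def attribute_weights(rows, query):
    """Weight each specified attribute by how rare its queried value is.

    Assumption: an IDF-style score, normalized so the weights sum to 1.
    The paper's own weighting formula is not stated in the abstract.
    """
    n = len(rows)
    raw = {}
    for attr, value in query.items():
        count = sum(1 for row in rows if row[attr] == value)
        raw[attr] = math.log((n + 1) / (count + 1))  # rarer value -> larger score
    total = sum(raw.values())
    if total == 0:  # every queried value matches every row: fall back to uniform
        return {attr: 1 / len(query) for attr in query}
    return {attr: w / total for attr, w in raw.items()}


def relax(rows, query, weights, max_degree=0.5):
    """Turn each equality criterion into a range around the queried value.

    Assumption: the half-width is max_degree * (1 - weight) * spread, so the
    most important attribute is relaxed the least.
    """
    relaxed = {}
    for attr, value in query.items():
        values = [row[attr] for row in rows]
        spread = max(values) - min(values)
        delta = max_degree * (1 - weights[attr]) * spread
        relaxed[attr] = (value - delta, value + delta)
    return relaxed


def run(rows, ranges):
    """Return the rows that satisfy every range criterion."""
    return [
        row for row in rows
        if all(lo <= row[attr] <= hi for attr, (lo, hi) in ranges.items())
    ]


# Hypothetical toy data: a small used-car table and a strict query.
rows = [
    {"price": 20000, "year": 2012},
    {"price": 21000, "year": 2010},
    {"price": 35000, "year": 2010},
    {"price": 19000, "year": 2010},
    {"price": 50000, "year": 2010},
]
query = {"price": 20000, "year": 2010}

weights = attribute_weights(rows, query)
relaxed_query = relax(rows, query, weights)
answers = run(rows, relaxed_query)
```

On this toy table the strict query matches no row. The value price = 20000 is rarer than year = 2010, so `price` receives the larger weight and a proportionally narrower range, and the relaxed query returns the two 2010 cars priced at 21000 and 19000.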












Acknowledgments
This work is supported by the National Science Foundation for Young Scientists of China (No. 61003162) and the Young Scholars Growth Plan of Liaoning (No. LJQ2013038).
About this article
Cite this article
Meng, X., Zhang, X., Tang, Y. et al. Adaptive query relaxation and top-k result ranking over autonomous web databases. Knowl Inf Syst 51, 395–433 (2017). https://doi.org/10.1007/s10115-016-0982-4
DOI: https://doi.org/10.1007/s10115-016-0982-4