
Adaptive query relaxation and top-k result ranking over autonomous web databases

  • Regular Paper
Knowledge and Information Systems

Abstract

Internet users may face the empty-answer or too-few-answers problem when they pose a strict query to a Web database. To address this problem, we develop a general framework that enables automatic query relaxation and top-k result ranking. Our framework consists of two processing steps. The first step is query relaxation. Based on the user's original query, we estimate how much the user cares about each specified attribute by measuring the distribution of its specified value in the database. If the specified value of an attribute is rare in the database, the attribute is likely to be important to the user. According to the attribute importance, the original query is then rewritten as a relaxed query by expanding the range of each query criterion. The degree of relaxation on each specified attribute varies adaptively with the attribute weight. The most important attribute is relaxed to the smallest degree, so that the answers returned by the relaxed query are as relevant as possible to the user's original intention. The second step is top-k result ranking. In this step, we first generate users' contextual preferences from the query history and then use them to create a priori orders of tuples during off-line pre-processing. Only a few representative orders are saved, each corresponding to a set of contexts. These orders and their associated contexts are then used at query time to provide the top-k relevant answers quickly by means of a top-k evaluation algorithm. The results of a preliminary user study demonstrate that our query relaxation and top-k result ranking methods capture users' preferences effectively. The efficiency and effectiveness of our approach are also demonstrated.
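
The query relaxation step described in the abstract can be illustrated with a minimal sketch. It is not the formulation used in the paper, which the abstract does not give. The IDF-style weight, the normalisation, the `max_fraction` parameter, and all function names and sample data below are assumptions made for illustration, and only numeric attributes are handled.

```python
import math
from typing import Dict, List, Tuple

Row = Dict[str, float]    # one tuple: attribute name -> numeric value
Query = Dict[str, float]  # strict query: attribute name -> required value


def attribute_weights(rows: List[Row], query: Query) -> Dict[str, float]:
    """Assumed IDF-style weighting: the rarer the specified value, the higher the weight."""
    n = len(rows)
    raw = {}
    for attr, value in query.items():
        matches = sum(1 for row in rows if row[attr] == value)
        raw[attr] = math.log((n + 1) / (matches + 1))
    total = sum(raw.values())
    if total == 0:  # every specified value is equally common: fall back to uniform weights
        return {attr: 1.0 / len(query) for attr in query}
    return {attr: w / total for attr, w in raw.items()}


def attribute_spans(rows: List[Row], query: Query) -> Dict[str, float]:
    """Width of the value range observed for each queried attribute."""
    return {a: max(r[a] for r in rows) - min(r[a] for r in rows) for a in query}


def relax_query(query: Query, weights: Dict[str, float], spans: Dict[str, float],
                max_fraction: float = 0.5) -> Dict[str, Tuple[float, float]]:
    """Turn each point criterion into a range; heavier attributes get narrower ranges.

    Assumption: half-width = max_fraction * (1 - weight) * span. With a single
    queried attribute the weight is 1, so this sketch does not relax it at all.
    """
    relaxed = {}
    for attr, value in query.items():
        delta = max_fraction * (1.0 - weights[attr]) * spans[attr]
        relaxed[attr] = (value - delta, value + delta)
    return relaxed


def exact_answers(rows: List[Row], query: Query) -> List[Row]:
    return [r for r in rows if all(r[a] == v for a, v in query.items())]


def range_answers(rows: List[Row], ranges: Dict[str, Tuple[float, float]]) -> List[Row]:
    return [r for r in rows if all(lo <= r[a] <= hi for a, (lo, hi) in ranges.items())]


# Hypothetical used-car table and a strict query that has no exact match.
ROWS = [
    {"price": 9000, "year": 2010},
    {"price": 12000, "year": 2010},
    {"price": 15000, "year": 2010},
    {"price": 20500, "year": 2011},
    {"price": 20000, "year": 2014},
]
QUERY = {"price": 20000, "year": 2010}

weights = attribute_weights(ROWS, QUERY)
relaxed = relax_query(QUERY, weights, attribute_spans(ROWS, QUERY))
print("weights:", weights)
print("relaxed ranges:", relaxed)
print("exact answers:", exact_answers(ROWS, QUERY))
print("relaxed answers:", range_answers(ROWS, relaxed))
```

In this toy table the price 20000 is rarer than the year 2010, so price receives the larger weight and is relaxed by a smaller fraction of its value range than year. The strict query has no exact match, while the relaxed query does return a tuple.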



Acknowledgments

This work is supported by the National Science Foundation for Young Scientists of China (No. 61003162) and the Young Scholars Growth Plan of Liaoning (No. LJQ2013038).

Author information

Corresponding author

Correspondence to Xiangfu Meng.

Cite this article

Meng, X., Zhang, X., Tang, Y. et al. Adaptive query relaxation and top-k result ranking over autonomous web databases. Knowl Inf Syst 51, 395–433 (2017). https://doi.org/10.1007/s10115-016-0982-4

  • DOI: https://doi.org/10.1007/s10115-016-0982-4
