Skip to main content
Log in

Scalable top-k keyword search in relational databases

  • Published:
Cluster Computing Aims and scope Submit manuscript

Abstract

Keyword search in relational databases has been widely studied in recent years because it does not require users neither to master a certain structured query language nor to know the complex underlying database schemas. There would be a huge number of valid results for a keyword query in a large database. However, only the top 10 or 20 most relevant matches for the keyword query—according to some definition of “Relevance”—are generally of interest. In this paper, we propose an efficient method which can efficiently compute the top-k results for keyword queries in a pipelined pattern, by incorporating the ranking mechanisms into the query processing method. Four optimization methods based on bounding the relevance scores of potential results, reusing and sharing the intermediate result are presented to improve the efficiency of the proposed algorithms. Compared to the existing top-k keyword search systems, the proposed methods can significantly reduce the number of computed query results with low relevance scores and the times for accessing databases, which result in the high efficiency in computing top-k keyword query results in relational databases. Extensive experiments on two real data sets are conducted to evaluate the effectiveness and efficiency of the proposed approach.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16

Similar content being viewed by others

Notes

  1. Note that the CN defined in KDynamic has some differences with ours.

  2. It has been proven to be a NP-complete problem in DISCOVER.

  3. It worth noting that every CN is started as an un-rooted tree, or a free tree.

  4. http://dblp.mpi-inf.mpg.de/dblp-mirror/index.php/.

  5. http://www.imdb.com.

References

  1. Yu, J.X., Qin, L., Chang, L.: Keyword Search in Relational Databases: A Survey. In: Bulletin of the IEEE Technical Committee on Data Engineering, vol. 33, no. 10 (2010)

  2. Hristidis, V., Gravano, L., Papakonstantinou, Y.: Efficient IR-style keyword search over relational databases. In: VLDB, pp. 850–861 (2003)

  3. Luo, Y., Lin, X., Wang, W., Zhou, X.: SPARK: top-k keyword query in relational databases. In: ACM SIGMOD, pp. 115–126 (2007)

  4. Luo, Y., Wang, W., Lin, X., Zhou, X., Wang, J., Li, K.: SPARK2: top-k keyword query in relational databases. IEEE Trans. Knowl. Data Eng. 23(12), 1763–1780 (2011)

    Article  Google Scholar 

  5. Hristidis, V., Papakonstantinou, Y.: DISCOVER: keyword search in relational databases. In: VLDB, pp. 670–681 (2002)

  6. Markowetz, A., Yang, Y., Papadias, D.: Keyword search on relational data streams. In: ACM SIGMOD, pp. 605–616 (2007)

  7. Qin, L., Yu, J.X., Chang, L., Tao, Y.: Scalable keyword search on large data streams. In: ICDE, pp. 1199–1202 (2009)

  8. Xu, Y., Guan, J., Ishikawa, Y.: Scalable top-k keyword search in relational databases. In: Database Systems for Advanced Applications—17th International Conference, DASFAA 2012, Proceedings, Part II, Busan, South Korea, 15–19 April 2012, pp. 65–80 (2012)

  9. Yu, J.X., Qin, L., Chang, L.: Keyword Search in Databases, Synthesis Lectures on Data Management. Morgan and Claypool Publishers, San Rafael (2010)

    Google Scholar 

  10. Xu, Y., Ishikawa, Y., Guan, J.: Efficient continuous top-k keyword search in relational databases. In: WAIM, pp. 755–767 (2010)

  11. Luo, Y.: SPARK: a keyword search system on relational databases. PhD Thesis, The University of New South Wales (2009)

  12. Cheng, S., Termehchy, A., Hristidis, V.: Predicting the effectiveness of keyword queries on databases. In: CIKM, pp. 1213–1222 (2012)

  13. Liu, F., Yu, C., Meng, W., Chowdhury, A.: Effective keyword search in relational databases. In: ACM SIGMOD, pp. 563–574 (2006)

  14. Bergamaschi, S., Ferro, N., Guerra, F., Silvello, G.: Keyword-based search over databases: a roadmap for a reference architecture paired with an evaluation framework. Trans. Comput. Collect. Intell. 21, 1–20 (2016)

    Google Scholar 

  15. Zhang, J., Peng, Z., Wang, S., Nie, H.: CLASCN: candidate network selection for efficient top-\(k\) keyword queries over databases. J. Comput. Sci. Technol. 22(2), 197–207 (2007)

    Article  Google Scholar 

  16. Aditya, B., Bhalotia, G., Chakrabarti, S., Hulgeri, A., Nakhe, C., Parag, P.: BANKS: browsing and keyword searching in relational databases. In: VLDB, pp. 1083–1086 (2002)

  17. He, H., Wang, H., Yang, J., Yu, P.S.: BLINKS: ranked keyword searches on graphs. In: ACM SIGMOD, New York, NY, USA, pp. 305–316. ACM (2007)

  18. Kacholia, V., Pandit, S., Chakrabarti, S., Sudarshan, S., Desai, R., Karambelkar, H.: Bidirectional expansion for keyword search on graph databases. In: VLDB, pp. 505–516 (2005)

  19. Li, G., Ooi, B.C., Feng, J., Wang, J., Zhou, L.: EASE: an effective 3-in-1 keyword search method for unstructured, semi-structured and structured data. In: ACM SIGMOD, pp. 903–914 (2008)

  20. Li, G., Zhou, X., Feng, J., Wang, J.: Progressive keyword search in relational databases. In: ICDE, pp. 1183–1186 (2009)

  21. Qin, L., Yu, J.X., Chang, L., Tao, Y.: Querying communities in relational databases. In: ICDE, pp. 724–735 (2009)

  22. Li, G., Feng, J., Zhou, X., Wang, J.: Providing built-in keyword search capabilities in RDBMS. VLDB J. 20(1), 1–19 (2011)

    Article  Google Scholar 

  23. Lopez-Veyna, J.I., Sosa, V.J.S., Lopez-Arevalo, I.: KESOSD: keyword search over structured data. In: KEYS, pp. 23–31 (2012)

  24. Kargar, M., An, A.: Efficient top-k keyword search in graphs with polynomial delay. In: ICDE, pp. 1269–1272 (2012)

  25. Ling, T.W., Le, T.N., Zeng, Z.: Towards an intelligent keyword search over XML and relational databases. In: International Conference on Big Data and Smart Computing, BIGCOMP 2014, Bangkok, Thailand, 15–17 January 2014, pp. 1–6 (2014)

  26. Torlone, R.: Towards a new foundation for keyword search in relational databases. In: Proceedings of the 8th Alberto Mendelzon Workshop on Foundations of Data Management, Cartagena de Indias, Colombia, 4–6 June 2014 (2014)

  27. Lin, Z., Li, Y., Lai, Y.: Improve the effectiveness of keyword search over relational database by node-temperature-based ant colony optimization. In: 12th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2015, Zhangjiajie, China, 15–17 August 2015, pp. 1209–1214 (2015)

  28. Ling, T.W., Zeng, Z., Le, T.N., Lee, M.: ORA-semantics based keyword search in XML and relational databases. In: 32nd IEEE International Conference on Data Engineering Workshops, ICDE Workshops 2016, Helsinki, Finland, 16–20 May 2016, pp. 157–160 (2016)

  29. Yu, Z., Yu, X., Chen, Y., Ma, K.: Distributed top-k keyword search over very large databases with MapReduce. In: 2016 IEEE International Congress on Big Data, San Francisco, CA, USA, 27 June–2 July 2016, pp. 349–352 (2016)

  30. Park, J., Lee, S.: Keyword search in relational databases. Knowl. Inf. Syst. 26(2), 175–193 (2011)

    Article  Google Scholar 

  31. Agrawal, S., Chaudhuri, S., Das, G.: DBXplorer: a system for keyword-based search over relational databases. In: ICDE, pp. 5–16 (2002)

  32. Qin, L., Yu, J.X., Chang, L.: Scalable keyword search on large data streams. VLDB J. 20(1), 35–57 (2011)

    Article  Google Scholar 

  33. Baid, A., Rae, I., Li, J., Doan, A., Naughton, J.F.: Toward scalable keyword search over relational data. PVLDB 3(1), 140–149 (2010)

    Google Scholar 

  34. Fakas, G.J.: A novel keyword search paradigm in relational databases: object summaries. Data Knowl. Eng. 70(2), 208–229 (2011)

    Article  Google Scholar 

  35. Xu, Y., Ishikawa, Y., Guan, J.: Efficient continual top-k keyword search in relational databases. J. Inf. Process. 20(1), 1–14 (2012)

    Google Scholar 

  36. Zeng, Z., Bao, Z., Ling, T.W., Lee, M.L.: iSearch: an interpretation based framework for keyword search in relational databases. In: KEYS, pp. 3–10 (2012)

  37. Xu, Y., Guan, J., Li, F., Zhou, S.: Scalable continual top-k keyword search in relational databases. Data Knowl. Eng. 86, 206–223 (2013)

    Article  Google Scholar 

  38. de Oliveira, P., da Silva, A.S., de Moura, E.S.: Ranking candidate networks of relations to improve keyword search over relational databases. In: 31st IEEE International Conference on Data Engineering, ICDE 2015, Seoul, South Korea, 13–17 April 2015, pp. 399–410 (2015)

  39. Zeng, Z., Bao, Z., Lee, M., Ling, T.W.: Towards an interactive keyword search over relational databases. In: Proceedings of the 24th International Conference on World Wide Web Companion, WWW 2015, Companion Volume, Florence, Italy, 18–22 May 2015, pp. 259–262 (2015)

  40. Kargar, M., An, A., Cercone, N., Godfrey, P., Szlichta, J., Yu, X.: Meaningful keyword search in relational databases with large and complex schema. In: 31st IEEE International Conference on Data Engineering, ICDE 2015, Seoul, South Korea, 13–17 April 2015, pp. 411–422 (2015)

  41. Zhou, J., Liu, Y., Yu, Z.: Improving the effectiveness of keyword search in databases using query logs. In: Web-Age Information Management—16th International Conference, WAIM 2015, Proceedings, Qingdao, China, 8–10 June 2015, pp. 193–206 (2015)

  42. Luo, Y., Wang, W., Lin, X.: SPARK: a keyword search engine on relational databases. In: ICDE, pp. 1552–1555 (2008)

  43. Lloyd, S.P.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28, 129–136 (1982)

    Article  MathSciNet  MATH  Google Scholar 

Download references

Acknowledgements

This research was supported by the Natural Science Foundation of Shanghai under Grant No. \(\sim \)14ZR1427700 and the Shanghai Engineering Research Center for Broadband Technologies and Applications (14DZ2280100).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yanwei Xu.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Xu, Y. Scalable top-k keyword search in relational databases. Cluster Comput 22 (Suppl 1), 731–747 (2019). https://doi.org/10.1007/s10586-017-1232-6

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10586-017-1232-6

Keywords

Navigation