A graph-theoretic approach to optimize keyword queries in relational databases

Park, Jaehui; Lee, Sang-goo

doi:10.1007/s10115-013-0690-2

Regular Paper
Published: 16 October 2013

Knowledge and Information Systems, volume 41, pages 843–870 (2014)


Abstract

Keyword search gives users an easy way to query large and complex databases without knowledge of structured query languages or the underlying database schema. Most existing studies have focused on generating candidate structured queries relevant to the keywords. Because the set of generated queries can be large, the execution costs may be prohibitive. However, existing studies lack a generalized method for optimizing the evaluation plan of such a large set of generated queries. In this paper, we introduce a graph-theoretic optimization approach. We propose a general graph model, the Weighted Operator Graph, to capture the costs of keyword query evaluation plans. The proposed model is flexible enough to integrate all of the cost-based plans in a uniform way. We define the Keyword Query Optimization Problem, based on a theoretical cost model, as a graph-theoretic problem and show that it is NP-hard. We propose a greedy heuristic, Maximum Propagation, that reduces the size of the intermediate results as early as possible. The proposed algorithm improves efficiency in terms of query evaluation costs. Experimental studies on both synthetic and real data sets show that our approach outperforms existing work.
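
To make the idea of shrinking intermediate results early more concrete, the following is a minimal sketch of a generic greedy join-ordering heuristic. It is not the Maximum Propagation algorithm or the Weighted Operator Graph from the paper, whose details are not given in the abstract. The function name `greedy_join_order`, the relation names, the cardinalities, and the selectivities are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def greedy_join_order(cardinalities, selectivity):
    """Illustrative greedy join ordering (NOT the paper's algorithm).

    At each step, join the pair of (intermediate) relations whose
    estimated result is smallest. Returns (plan, total_cost), where
    total_cost is the sum of the estimated intermediate result sizes.
    """
    # Map each set of base relations to its estimated row count.
    parts = {frozenset([name]): float(rows)
             for name, rows in cardinalities.items()}
    plan, total_cost = [], 0.0

    def join_size(a, b):
        # Estimated size: product of both sizes times the selectivity
        # of every join edge that connects a relation in a to one in b.
        size = parts[a] * parts[b]
        connected = False
        for x in a:
            for y in b:
                s = selectivity.get(frozenset([x, y]))
                if s is not None:
                    size *= s
                    connected = True
        return size if connected else None  # None: would be a cross product

    while len(parts) > 1:
        candidates = []
        for a, b in combinations(list(parts), 2):
            size = join_size(a, b)
            if size is not None:
                candidates.append((size, sorted(a), sorted(b), a, b))
        if not candidates:
            raise ValueError("join graph is disconnected")
        # Smallest estimated result first; names break ties deterministically.
        size, _, _, a, b = min(candidates, key=lambda c: c[:3])
        del parts[a], parts[b]
        parts[a | b] = size
        plan.append((sorted(a), sorted(b), size))
        total_cost += size
    return plan, total_cost


# Hypothetical statistics for a three-relation schema.
example_cardinalities = {"Author": 8, "Writes": 16, "Paper": 4}
example_selectivity = {
    frozenset(["Author", "Writes"]): 0.125,
    frozenset(["Writes", "Paper"]): 0.0625,
}
plan, cost = greedy_join_order(example_cardinalities, example_selectivity)
```

With these made-up numbers, the sketch joins Writes with Paper first (estimated 4 rows) and then adds Author, for a total estimated intermediate size of 8. Starting with Author and Writes would instead produce a 16-row intermediate result and a total of 20.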

Acknowledgments

This research was funded by the MSIP (Ministry of Science, ICT & Future Planning), Korea, in the ICT R&D Program 2013.

Author information

Authors and Affiliations

Electronics and Telecommunications Research Institute, 218 Gajeong-ro, Yuseong-gu, Daejeon, Republic of Korea
Jaehui Park
Seoul National University, 138-dong 418-ho, 1 Gwanak-ro, Gwanak-gu, Seoul, Republic of Korea
Sang-goo Lee

Corresponding author

Correspondence to Jaehui Park.

About this article

Cite this article

Park, J., Lee, Sg. A graph-theoretic approach to optimize keyword queries in relational databases. Knowl Inf Syst 41, 843–870 (2014). https://doi.org/10.1007/s10115-013-0690-2

Received: 30 January 2013
Revised: 03 September 2013
Accepted: 14 September 2013
Published: 16 October 2013
Issue Date: December 2014
DOI: https://doi.org/10.1007/s10115-013-0690-2
