DOI:10.1145/2254736.2254743

KESOSD: keyword search over structured data

Published: 20 May 2012

Abstract

Most of the information on the Web can currently be classified by its structure into three forms: unstructured (plain text), semi-structured (XML files) and structured (tables in a relational database). Web search is currently the primary way to access massive amounts of information. Keyword search has also become an alternative for querying relational databases and XML documents, one that is simple for people who are familiar with Web search engines. There are several approaches to keyword search over relational databases, such as Steiner Trees, Candidate Networks and Tuple Units. However, these methods have some constraints. The Steiner Tree approach requires solving an NP-hard problem; moreover, a real database can produce a large number of Steiner Trees, which are difficult to identify and index. The Candidate Network approach must first generate the candidate networks and then evaluate them to find the best answer. The problem is that, for a keyword query, the number of Candidate Networks can be very large, and finding a common join expression to evaluate all of them can require a large computational effort. Finally, Tuple Units, in their general form, produce very large structures that most of the time store redundant information. To address these problems we propose a novel approach for keyword search over structured data (KESOSD). KESOSD models the structured information as graphs and proposes the use of a keyword-structure-aware index called KSAI that captures the implicit structural relationships of the information, producing fast and accurate search responses. We have conducted experiments, and the results show that KESOSD achieves high search efficiency and high accuracy for keyword search over structured data.
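
The general idea of modeling relational tuples as a graph and answering keyword queries through an index can be illustrated with a minimal sketch. It is not the KSAI index or the KESOSD algorithm, whose details the abstract does not give. It assumes a toy database in which tuples are nodes and foreign keys are undirected edges, a plain inverted index from terms to nodes, and a simple ranking by total hop distance to the nearest match of each keyword. All names (`TUPLES`, `FOREIGN_KEYS`, `build_graph`, `build_inverted_index`, `distances_from`, `search`) are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical toy database: every tuple becomes one graph node.
TUPLES = {
    "author:1": "Vagelis Hristidis",
    "author:2": "Jennifer Widom",
    "paper:1": "Discover keyword search in relational databases",
    "paper:2": "Indexing relational database content offline",
}
# Foreign-key references become undirected edges (assumed schema).
FOREIGN_KEYS = [("author:1", "paper:1"), ("author:2", "paper:2")]


def build_graph(edges):
    """Adjacency sets for an undirected tuple graph."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def build_inverted_index(tuples):
    """Map each lowercased term to the set of nodes containing it."""
    index = defaultdict(set)
    for node, text in tuples.items():
        for term in text.lower().split():
            index[term].add(node)
    return index


def distances_from(graph, sources):
    """Multi-source BFS: hop distance from each node to its nearest source."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def search(keywords, graph, index):
    """Return (score, node) pairs for nodes connected to every keyword.

    The score is the sum of hop distances to the nearest match of each
    keyword; lower is better. This is a simplification, not a minimum
    Steiner tree computation.
    """
    per_keyword = [
        distances_from(graph, index.get(k.lower(), set())) for k in keywords
    ]
    if not per_keyword:
        return []
    candidates = set(per_keyword[0]).intersection(*per_keyword[1:])
    return sorted((sum(d[n] for d in per_keyword), n) for n in candidates)


graph = build_graph(FOREIGN_KEYS)
index = build_inverted_index(TUPLES)
print(search(["hristidis", "keyword"], graph, index))
```

Ranking candidate nodes by summed distances avoids enumerating Steiner Trees or Candidate Networks, at the cost of returning connecting nodes rather than full answer trees. A real system would also need stemming, scoring by relevance, and an index that scales beyond memory.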


      Published In

      KEYS '12: Proceedings of the Third International Workshop on Keyword Search on Structured Data
      May 2012
      78 pages
      ISBN:9781450311984
      DOI:10.1145/2254736
      General Chairs: Ling Tok Wang, Ge Yu, Jiaheng Lu, Wei Wang

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. databases
      2. graphs
      3. indexing
      4. keyword search
      5. top-k
      6. virtual documents

      Qualifiers

      • Research-article

      Funding Sources

      • Fondo Mixto Conacyt-Gobierno del Estado de Tamaulipas

      Conference

      SIGMOD/PODS '12

