DOI: 10.1145/1739041.1739107
research-article

Fast ELCA computation for keyword queries on XML data

Published: 22 March 2010

ABSTRACT

Keyword search is integrated into many applications because it lets users express their query intent conveniently. Recently, answering keyword queries on XML data has drawn the attention of the web and database communities, because success in this research would relieve users of the need to learn complex XML query languages, such as XPath and XQuery, or to know the underlying schema of the queried XML data. As a result, information in XML data can be discovered much more easily.

To model the result of answering keyword queries on XML data, many notions based on the LCA (lowest common ancestor) have been proposed. In this paper, we focus on ELCA (Exclusive LCA) semantics, which was first proposed by Guo et al. and later named by Xu and Papakonstantinou. We propose an algorithm named Hash Count to find ELCAs efficiently. Our analysis shows that the complexity of the Hash Count algorithm is O(kd|S1|), where k is the number of keywords, d is the depth of the queried XML document, and |S1| is the frequency of the rarest keyword. This complexity is the best result known so far. We also evaluate the algorithm on a real DBLP dataset and compare it with state-of-the-art algorithms. The experimental results demonstrate the advantage of the Hash Count algorithm in practice.
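The abstract does not spell out the ELCA definition or the internals of Hash Count, so the following is only a minimal brute-force sketch of ELCA semantics as commonly stated in the literature: a node is an ELCA if its subtree still contains every keyword after excluding the subtrees of its descendants that themselves contain every keyword. It is not the Hash Count algorithm and makes no attempt to reach O(kd|S1|). The Dewey-style tuple labels, the `postings` dictionary, and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Assumption: each XML node is identified by a Dewey-style label,
# e.g. (0, 1, 2) is the third child of the second child of the root (0,).
Dewey = Tuple[int, ...]


def is_ancestor_or_self(a: Dewey, b: Dewey) -> bool:
    """True if node a is b itself or an ancestor of b."""
    return len(a) <= len(b) and b[:len(a)] == a


def elca_bruteforce(postings: Dict[str, List[Dewey]]) -> List[Dewey]:
    """Return all ELCA nodes for the keywords in `postings`.

    `postings` (hypothetical input format) maps each keyword to the
    labels of the nodes that directly contain it.
    """
    keywords = list(postings)

    # Only ancestors-or-self of some match can contain any keyword.
    candidates: Set[Dewey] = set()
    for nodes in postings.values():
        for n in nodes:
            for depth in range(1, len(n) + 1):
                candidates.add(n[:depth])

    def contains_all(v: Dewey) -> bool:
        return all(
            any(is_ancestor_or_self(v, n) for n in postings[k])
            for k in keywords
        )

    # "Full" nodes: subtrees that contain every keyword at least once.
    full = {v for v in candidates if contains_all(v)}

    def has_exclusive_witness(v: Dewey, k: str) -> bool:
        # A match n counts for v only if no full node lies on the
        # path from v (exclusive) down to n (inclusive).
        for n in postings[k]:
            if not is_ancestor_or_self(v, n):
                continue
            blocked = any(
                n[:depth] in full for depth in range(len(v) + 1, len(n) + 1)
            )
            if not blocked:
                return True
        return False

    return [
        v for v in sorted(full)
        if all(has_exclusive_witness(v, k) for k in keywords)
    ]


# Hypothetical example: the root (0,) keeps its own matches (0,1) and (0,2)
# after the full subtree rooted at (0,0) is excluded, so both are ELCAs.
example = {
    "xml": [(0, 0, 0), (0, 1)],
    "keyword": [(0, 0, 1), (0, 2)],
}
print(elca_bruteforce(example))
```

In this example the root qualifies only because it has matches outside the subtree rooted at (0, 0); if the match (0, 2) is dropped, the root loses its exclusive witness for "keyword" and only (0, 0) remains.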

References

  1. DBLP XML Records. http://dblp.uni-trier.de/xml/.
  2. Oracle Berkeley DB. http://www.oracle.com/technology/products/berkeley-db/index.html.
  3. S. Agrawal, S. Chaudhuri, and G. Das. DBXplorer: A system for keyword-based search over relational databases. In ICDE, pages 5--16, 2002.
  4. G. Bhalotia, A. Hulgeri, C. Nakhe, S. Chakrabarti, and S. Sudarshan. Keyword searching and browsing in databases using BANKS. In ICDE, pages 431--440, 2002.
  5. S. Cohen, J. Mamou, Y. Kanza, and Y. Sagiv. XSEarch: A semantic search engine for XML. In VLDB, pages 45--56, 2003.
  6. R. Goldman, N. Shivakumar, S. Venkatasubramanian, and H. Garcia-Molina. Proximity search in databases. In VLDB, pages 26--37. Morgan Kaufmann, 1998.
  7. K. Golenberg, B. Kimelfeld, and Y. Sagiv. Keyword proximity search in complex data graphs. In SIGMOD Conference, pages 927--940, 2008.
  8. L. Guo, F. Shao, C. Botev, and J. Shanmugasundaram. XRANK: Ranked keyword search over XML documents. In SIGMOD Conference, pages 16--27, 2003.
  9. H. He, H. Wang, J. Yang, and P. S. Yu. BLINKS: Ranked keyword searches on graphs. In SIGMOD Conference, pages 305--316, 2007.
  10. V. Hristidis and Y. Papakonstantinou. DISCOVER: Keyword search in relational databases. In VLDB, pages 670--681, 2002.
  11. V. Hristidis, Y. Papakonstantinou, and A. Balmin. Keyword proximity search on XML graphs. In ICDE, pages 367--378, 2003.
  12. V. Kacholia, S. Pandit, S. Chakrabarti, S. Sudarshan, R. Desai, and H. Karambelkar. Bidirectional expansion for keyword search on graph databases. In VLDB, pages 505--516, 2005.
  13. L. Kong, R. Gilleron, and A. Lemay. Retrieving meaningful relaxed tightest fragments for XML keyword search. In EDBT, pages 815--826, 2009.
  14. G. Li, J. Feng, J. Wang, and L. Zhou. Effective keyword search for valuable LCAs over XML documents. In CIKM, pages 31--40, 2007.
  15. Y. Li, C. Yu, and H. V. Jagadish. Schema-free XQuery. In VLDB, pages 72--83, 2004.
  16. F. Liu, C. T. Yu, W. Meng, and A. Chowdhury. Effective keyword search in relational databases. In SIGMOD Conference, pages 563--574, 2006.
  17. Z. Liu and Y. Chen. Identifying meaningful return information for XML keyword search. In SIGMOD Conference, pages 329--340, 2007.
  18. Z. Liu and Y. Chen. Reasoning and identifying relevant matches for XML keyword search. PVLDB, 1(1):921--932, 2008.
  19. Y. Luo, X. Lin, W. Wang, and X. Zhou. SPARK: Top-k keyword query in relational databases. In SIGMOD Conference, pages 115--126, 2007.
  20. L. Qin, J. X. Yu, and L. Chang. Keyword search in databases: The power of RDBMS. In SIGMOD Conference, pages 681--694, 2009.
  21. A. Schmidt, M. L. Kersten, and M. Windhouwer. Querying XML documents made easy: Nearest concept queries. In ICDE, pages 321--329, 2001.
  22. C. Sun, C. Y. Chan, and A. K. Goenka. Multiway SLCA-based keyword search in XML data. In WWW, pages 1043--1052, 2007.
  23. W. Wang, X. Wang, and A. Zhou. Hash-search: An efficient SLCA-based keyword search algorithm on XML documents. In DASFAA, pages 496--510, 2009.
  24. Y. Xu and Y. Papakonstantinou. Efficient keyword search for smallest LCAs in XML databases. In SIGMOD Conference, pages 537--538, 2005.
  25. Y. Xu and Y. Papakonstantinou. Efficient LCA based keyword search in XML data. In EDBT, pages 535--546, 2008.

  • Published in

    EDBT '10: Proceedings of the 13th International Conference on Extending Database Technology
    March 2010
    741 pages
    ISBN: 9781605589459
    DOI: 10.1145/1739041

    Copyright © 2010 ACM

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Acceptance Rates

    Overall Acceptance Rate: 7 of 10 submissions, 70%
