ABSTRACT
Keyword search is integrated into many applications because it lets users express their query intent conveniently. Recently, answering keyword queries on XML data has drawn the attention of the web and database communities, because success in this research would relieve users of the need to learn complex XML query languages, such as XPath and XQuery, or to know the underlying schema of the queried XML data. As a result, information in XML data can be discovered much more easily.
To model the results of keyword queries on XML data, many notions based on the LCA (lowest common ancestor) have been proposed. In this paper, we focus on the ELCA (Exclusive LCA) semantics, which was first proposed by Guo et al. and later named by Xu and Papakonstantinou. We propose an algorithm named Hash Count to find ELCAs efficiently. Our analysis shows that the complexity of the Hash Count algorithm is O(kd|S1|), where k is the number of keywords, d is the depth of the queried XML document, and |S1| is the frequency of the rarest keyword. This is the best complexity bound known so far. We also evaluate the algorithm on a real DBLP dataset and compare it with state-of-the-art algorithms. The experimental results demonstrate the advantage of the Hash Count algorithm in practice.
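To make the ELCA semantics concrete, the following is a minimal brute-force sketch. It is not the Hash Count algorithm, whose details the abstract does not give, and it makes no attempt to reach the O(kd|S1|) bound. It assumes that each XML node is identified by a Dewey label written as a tuple (the root is `(1,)`, its second child is `(1, 2)`, and so on) and that each keyword comes with the list of nodes that directly contain it. The names `contains_all` and `naive_elca` are hypothetical and chosen only for illustration. A node is reported as an ELCA if, for every keyword, it still has a match in its subtree after discarding the matches that lie inside child subtrees which themselves contain all keywords.

```python
from itertools import chain


def contains_all(node, keyword_lists):
    """Return True if every keyword has a match in the subtree rooted at node.

    A match m lies in the subtree of node when node is a prefix of m.
    """
    return all(
        any(m[:len(node)] == node for m in matches)
        for matches in keyword_lists
    )


def naive_elca(keyword_lists):
    """Brute-force ELCA computation over Dewey labels (illustration only)."""
    # Every ancestor-or-self of a match is a candidate.
    candidates = {
        m[:i]
        for m in chain.from_iterable(keyword_lists)
        for i in range(1, len(m) + 1)
    }
    result = []
    for v in sorted(candidates):
        if not contains_all(v, keyword_lists):
            continue
        is_elca = True
        for matches in keyword_lists:
            has_witness = False
            for m in matches:
                if m[:len(v)] != v:
                    continue  # match is outside the subtree of v
                if len(m) == len(v):
                    has_witness = True  # the keyword occurs at v itself
                    break
                child = m[:len(v) + 1]  # child of v on the path to m
                if not contains_all(child, keyword_lists):
                    has_witness = True  # match is not claimed by a lower node
                    break
            if not has_witness:
                is_elca = False
                break
        if is_elca:
            result.append(v)
    return result


# Hypothetical two-keyword query on a small tree.
xml_matches = [(1, 1, 1), (1, 2)]
keyword_matches = [(1, 1, 2), (1, 3)]
print(naive_elca([xml_matches, keyword_matches]))  # [(1,), (1, 1)]

# Without the match at (1, 2), the root has no exclusive match for the first keyword.
print(naive_elca([[(1, 1, 1)], keyword_matches]))  # [(1, 1)]
```

In the first query the root qualifies because the matches at `(1, 2)` and `(1, 3)` are not inside any lower subtree that contains both keywords. In the second query the only match of the first keyword is already covered by `(1, 1)`, so the root is excluded.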
REFERENCES
- DBLP XML Records. In http://dblp.uni-trier.de/xml/.
- Oracle Berkeley DB. In http://www.oracle.com/technology/products/berkeley-db/index.html.
- S. Agrawal, S. Chaudhuri, and G. Das. Dbxplorer: A system for keyword-based search over relational databases. In ICDE, pages 5--16, 2002.
- G. Bhalotia, A. Hulgeri, C. Nakhe, S. Chakrabarti, and S. Sudarshan. Keyword searching and browsing in databases using banks. In ICDE, pages 431--440, 2002.
- S. Cohen, J. Mamou, Y. Kanza, and Y. Sagiv. Xsearch: A semantic search engine for xml. In VLDB, pages 45--56, 2003.
- R. Goldman, N. Shivakumar, S. Venkatasubramanian, and H. Garcia-Molina. Proximity search in databases. In VLDB, pages 26--37. Morgan Kaufmann, 1998.
- K. Golenberg, B. Kimelfeld, and Y. Sagiv. Keyword proximity search in complex data graphs. In SIGMOD Conference, pages 927--940, 2008.
- L. Guo, F. Shao, C. Botev, and J. Shanmugasundaram. Xrank: Ranked keyword search over xml documents. In SIGMOD Conference, pages 16--27, 2003.
- H. He, H. Wang, J. Yang, and P. S. Yu. Blinks: ranked keyword searches on graphs. In SIGMOD Conference, pages 305--316, 2007.
- V. Hristidis and Y. Papakonstantinou. Discover: Keyword search in relational databases. In VLDB, pages 670--681, 2002.
- V. Hristidis, Y. Papakonstantinou, and A. Balmin. Keyword proximity search on xml graphs. In ICDE, pages 367--378, 2003.
- V. Kacholia, S. Pandit, S. Chakrabarti, S. Sudarshan, R. Desai, and H. Karambelkar. Bidirectional expansion for keyword search on graph databases. In VLDB, pages 505--516, 2005.
- L. Kong, R. Gilleron, and A. Lemay. Retrieving meaningful relaxed tightest fragments for xml keyword search. In EDBT, pages 815--826, 2009.
- G. Li, J. Feng, J. Wang, and L. Zhou. Effective keyword search for valuable lcas over xml documents. In CIKM, pages 31--40, 2007.
- Y. Li, C. Yu, and H. V. Jagadish. Schema-free xquery. In VLDB, pages 72--83, 2004.
- F. Liu, C. T. Yu, W. Meng, and A. Chowdhury. Effective keyword search in relational databases. In SIGMOD Conference, pages 563--574, 2006.
- Z. Liu and Y. Chen. Identifying meaningful return information for xml keyword search. In SIGMOD Conference, pages 329--340, 2007.
- Z. Liu and Y. Chen. Reasoning and identifying relevant matches for xml keyword search. PVLDB, 1(1):921--932, 2008.
- Y. Luo, X. Lin, W. Wang, and X. Zhou. Spark: top-k keyword query in relational databases. In SIGMOD Conference, pages 115--126, 2007.
- L. Qin, J. X. Yu, and L. Chang. Keyword search in databases: the power of rdbms. In SIGMOD Conference, pages 681--694, 2009.
- A. Schmidt, M. L. Kersten, and M. Windhouwer. Querying xml documents made easy: Nearest concept queries. In ICDE, pages 321--329, 2001.
- C. Sun, C. Y. Chan, and A. K. Goenka. Multiway slca-based keyword search in xml data. In WWW, pages 1043--1052, 2007.
- W. Wang, X. Wang, and A. Zhou. Hash-search: An efficient slca-based keyword search algorithm on xml documents. In DASFAA, pages 496--510, 2009.
- Y. Xu and Y. Papakonstantinou. Efficient keyword search for smallest lcas in xml databases. In SIGMOD Conference, pages 537--538, 2005.
- Y. Xu and Y. Papakonstantinou. Efficient lca based keyword search in xml data. In EDBT, pages 535--546, 2008.