
Enabling Schema-Free XQuery with meaningful query focus

  • Regular Paper
The VLDB Journal

Abstract

The widespread adoption of XML holds the promise that document structure can be exploited to specify precise database queries. However, users may have only limited knowledge of the XML structure and may be unable to produce a correct XQuery expression, especially in the context of a heterogeneous information collection. The usual fallback is keyword-based search, and we are all too familiar with how difficult it is to obtain precise answers by this means. We seek to address these problems by introducing the notion of Meaningful Query Focus (MQF) for finding related nodes within an XML document. MQF enables users to take full advantage of the precision and efficiency of XQuery without requiring (perfect) knowledge of the document structure. Such a Schema-Free XQuery is potentially of value not just to casual users with partial knowledge of the schema, but also to experts working in data integration or data evolution. In such a context, a schema-free query, once written, can be applied universally to multiple data sources that supply similar content under different schemas, and applied “forever” as these schemas evolve. Our experimental evaluation found that it is possible to express a wide variety of queries in a schema-free manner and efficiently retrieve correct results over a broad diversity of schemas. Furthermore, the evaluation of a schema-free query is not expensive: using a novel stack-based algorithm we developed for computing MQF, the overhead is from 1 to 4 times the execution time of an equivalent schema-aware query. The evaluation cost of schema-free queries can be further reduced by as much as 68% using a selectivity-based algorithm we developed to enable the integration of the MQF operation into the query pipeline.



Author information

Corresponding author

Correspondence to Yunyao Li.

About this article

Cite this article

Li, Y., Yu, C. & Jagadish, H.V. Enabling Schema-Free XQuery with meaningful query focus. The VLDB Journal 17, 355–377 (2008). https://doi.org/10.1007/s00778-006-0003-4

  • DOI: https://doi.org/10.1007/s00778-006-0003-4
