Containment of partially specified tree-pattern queries in the presence of dimension graphs

  • Regular Paper
  • The VLDB Journal

Abstract

Nowadays, huge volumes of data are organized or exported in tree-structured form, and querying capabilities are provided through tree-pattern queries. The need to query tree-structured data sources whose structure is not fully known, and the need to integrate multiple data sources with different tree structures, have recently motivated query languages that relax the complete specification of a tree pattern. In this paper, we consider a query language that allows the partial specification of a tree pattern. Queries in this language range from structureless keyword-based queries to completely specified tree patterns. To support the evaluation of partially specified queries, we use semantically rich constructs, called dimension graphs, which abstract structural information of the tree-structured data. We address the problem of query containment in the presence of dimension graphs and provide necessary and sufficient conditions for query containment. Because checking query containment can be expensive, we suggest two heuristic approaches for query containment in the presence of dimension graphs. Both are based on extracting structural information from the dimension graph that can be added to the queries while preserving equivalence with respect to the dimension graph. We consider two cases: extracting and storing different types of structural information in advance, and extracting information on the fly (at query time). Both approaches are implemented, validated, and compared through experimental evaluation.
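To make the notion of containment concrete, the following is a minimal sketch of the classical homomorphism-based test for fully specified tree patterns with child (`/`) and descendant (`//`) edges. It is not the method of this paper: it handles neither partially specified patterns nor dimension graphs, and it assumes root-anchored Boolean patterns without wildcards. The names `Node`, `embeds`, and `contained_in` are hypothetical and chosen only for illustration. Finding a mapping from the more general pattern into the more specific one is a sufficient condition for containment.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    # Each entry is (axis, child), with axis "/" (child) or "//" (descendant).
    edges: list = field(default_factory=list)

    def add(self, axis, child):
        self.edges.append((axis, child))
        return self


def proper_descendants(node):
    """Yield every pattern node strictly below `node`."""
    for _, child in node.edges:
        yield child
        yield from proper_descendants(child)


def embeds(q, target):
    """True if the pattern rooted at q can be mapped onto the pattern node target."""
    if q.label != target.label:
        return False
    for axis, q_child in q.edges:
        if axis == "/":
            # A child edge must land on a child edge of the target.
            candidates = [c for a, c in target.edges if a == "/"]
        else:
            # A descendant edge may land on any downward path of length >= 1.
            candidates = proper_descendants(target)
        if not any(embeds(q_child, c) for c in candidates):
            return False
    return True


def contained_in(q1, q2):
    """Test q1 ⊆ q2 by searching for a root-to-root homomorphism from q2 into q1."""
    return embeds(q2, q1)


# q1: book with a child author (which has a child name) and a child title.
q1 = Node("book").add("/", Node("author").add("/", Node("name"))).add("/", Node("title"))
# q2: book with some descendant name.
q2 = Node("book").add("//", Node("name"))

print(contained_in(q1, q2))
print(contained_in(q2, q1))
```

In this example every tree matching `q1` also matches `q2`, because a `name` below `author` below `book` is a descendant of `book`; the converse fails because `q2` does not require an `author` or `title` child. The heuristics described in the abstract go further by enriching the queries with structural information drawn from the dimension graph before comparing them, which this sketch does not model.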

Author information

Corresponding author

Correspondence to Dimitri Theodoratos.

About this article

Cite this article

Theodoratos, D., Placek, P., Dalamagas, T. et al. Containment of partially specified tree-pattern queries in the presence of dimension graphs. The VLDB Journal 18, 233–254 (2009). https://doi.org/10.1007/s00778-008-0097-y

  • DOI: https://doi.org/10.1007/s00778-008-0097-y
