DOI: 10.1145/1416691.1416696
research-article

A cost-based join selection for XML twig content-based queries

Published: 25 March 2008

ABSTRACT

XML (Extensible Markup Language) has been embraced as a new approach to data modeling. More and more information is now formatted as semi-structured data, e.g., articles in a digital library or documents on the web. Implementing an efficient system for storing and querying XML documents requires the development of new techniques.

Many XML indexing techniques have been proposed in recent years. Across several classes of indexing methods, we distinguish two kinds of joins for processing twig queries. The first join merges two sets retrieved from an inverted list. The second join uses the result of the first query to build the second query. Although authors have proposed improvements to their joins, the advantages of applying one join operation rather than another have not yet been discussed. In this article, we propose a join selection based on the cost of a join. Choosing the more appropriate join operation significantly improves twig query processing efficiency.
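To make the idea of cost-based join selection concrete, the following minimal Python sketch picks between the two kinds of joins described above. The abstract does not give the paper's cost model, so everything here is an assumption made for illustration: the names `JoinInput`, `merge_join_cost`, `result_driven_join_cost`, and `select_join` are hypothetical, the merge join is assumed to scan both sorted lists once, and the second join is assumed to issue one index probe per result of the first query, with a probe cost that grows logarithmically with the size of the second input.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinInput:
    """Hypothetical statistics for one side of a twig-query join."""
    name: str
    size: int  # estimated number of matching nodes on this side


def merge_join_cost(first: JoinInput, second: JoinInput) -> float:
    # Assumed model: both sorted lists are read once and merged.
    return float(first.size + second.size)


def result_driven_join_cost(first: JoinInput, second: JoinInput) -> float:
    # Assumed model: every result of the first query builds one probe
    # into the second input; a probe costs 1 + log2(size of second input).
    probe_cost = 1.0 + math.log2(max(second.size, 1))
    return first.size * probe_cost


def select_join(first: JoinInput, second: JoinInput) -> tuple[str, float]:
    """Return the name and estimated cost of the cheaper join."""
    costs = {
        "merge": merge_join_cost(first, second),
        "result-driven": result_driven_join_cost(first, second),
    }
    best = min(costs, key=costs.get)
    return best, costs[best]


if __name__ == "__main__":
    # A selective content predicate against a large element list.
    selective = JoinInput("content match", 10)
    large = JoinInput("element list", 1_000_000)
    print(select_join(selective, large))

    # Two large inputs of comparable size.
    broad = JoinInput("broad match", 500_000)
    print(select_join(broad, large))
```

Under these assumed formulas, a highly selective first query favors the result-driven join, because a handful of probes is cheaper than scanning a large list, while two large inputs favor the merge join. A real optimizer would replace both formulas with estimates calibrated to its own index structures.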


Published in

DataX '08: Proceedings of the 2008 EDBT workshop on Database technologies for handling XML information on the web
March 2008
76 pages
ISBN: 9781595939661
DOI: 10.1145/1416691

Copyright © 2008 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States
