Reducing Graph Matching to Tree Matching for XML Queries with ID References

  • Conference paper
Database and Expert Systems Applications (DEXA 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6262)

Abstract

ID/IDREF is an important and widely used feature in XML documents for eliminating data redundancy. Most existing algorithms consider an XML document with ID references as a graph and perform graph matching for queries involving ID references. Graph matching naturally brings higher complexity compared with original tree matching algorithms that process XML queries. In this paper, we make use of semantics of ID/IDREF to reduce graph matching to tree matching to process queries involving ID references. Using our approach, an XML document with ID/IDREF is not treated as a graph, and a general query with ID references will be decomposed and processed using tree pattern matching techniques, which are more efficient than graph matching. Furthermore, our approach is able to handle complex ID references, such as cyclic references and sequential references, which cannot be handled efficiently by existing approaches. The experimental results show that our approach is 20-50% faster than MonetDB, an XQuery engine, and at least 100 times faster than TwigStackD, an existing graph matching algorithm.
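The general idea of answering a query with an ID reference without traversing a graph can be illustrated with a minimal sketch. This is not the paper's algorithm: it only assumes that such a query can be split into two ordinary tree patterns whose results are then joined on the ID value. The document, the element names, the attribute names `id` and `ref`, and the function `titles_by_author` are all hypothetical and chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <writer ref="..."> plays the role of an IDREF
# pointing at <author id="...">. No DTD is declared here, so treating
# these attributes as ID/IDREF is an assumption of this sketch.
DOC = """
<library>
  <author id="a1"><name>Ling</name></author>
  <author id="a2"><name>Dobbie</name></author>
  <book><title>XML Twigs</title><writer ref="a1"/></book>
  <book><title>Graph Joins</title><writer ref="a2"/></book>
  <book><title>ID Semantics</title><writer ref="a1"/></book>
</library>
"""


def titles_by_author(root, author_name):
    """Return titles of books whose writer reference points at the named author."""
    # Tree pattern 1: author[name = author_name], projected onto its ID value.
    author_ids = {
        author.get("id")
        for author in root.findall("author")
        if author.findtext("name") == author_name
    }
    # Tree pattern 2: book[title][writer], projected onto (reference value, title).
    ref_title_pairs = [
        (writer.get("ref"), book.findtext("title"))
        for book in root.findall("book")
        for writer in book.findall("writer")
    ]
    # Value join on ID = IDREF replaces following the reference edge in a graph.
    return [title for ref, title in ref_title_pairs if ref in author_ids]


root = ET.fromstring(DOC)
print(titles_by_author(root, "Ling"))
```

Each of the two patterns is matched against the document tree alone; the reference is resolved only by comparing attribute values, so the document never has to be materialised as a graph.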

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wu, H., Ling, T.W., Dobbie, G., Bao, Z., Xu, L. (2010). Reducing Graph Matching to Tree Matching for XML Queries with ID References. In: Bringas, P.G., Hameurlain, A., Quirchmayr, G. (eds) Database and Expert Systems Applications. DEXA 2010. Lecture Notes in Computer Science, vol 6262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15251-1_31

  • DOI: https://doi.org/10.1007/978-3-642-15251-1_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15250-4

  • Online ISBN: 978-3-642-15251-1

  • eBook Packages: Computer Science, Computer Science (R0)
