Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6462)

Abstract

Over the past five years, we have designed, implemented, and optimized our prototype system XTC, a native XML database management system (XDBMS) that provides multi-user read/write transactions and supports multi-lingual query interfaces (XQuery, XPath, DOM, SAX). We have compared competing concepts in various system layers and iteratively found salient solutions that drastically improved overall XDBMS performance. XML query processing depends critically on the smooth interplay of concepts and methods. Here, we focus on the physical level of XML processing: node labeling and mapping options for storage structures; the design of suitable index mechanisms; and enriched functionality of path processing operators, in particular for holistic twig joins. In this survey, we outline the experience gained during the evolution of XTC and develop “key concepts” that enable fine-grained, effective, and efficient XML processing.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Härder, T., Mathis, C. (2010). Key Concepts for Native XML Processing. In: Sachs, K., Petrov, I., Guerrero, P. (eds) From Active Data Management to Event-Based Systems and More. Lecture Notes in Computer Science, vol 6462. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17226-7_1

  • DOI: https://doi.org/10.1007/978-3-642-17226-7_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-17225-0

  • Online ISBN: 978-3-642-17226-7

  • eBook Packages: Computer Science, Computer Science (R0)
