Abstract
Over the past five years, we have designed, implemented, and optimized our prototype system XTC, a native XDBMS that provides multi-user read/write transactions and supports multi-lingual query interfaces (XQuery, XPath, DOM, SAX). We have compared competing concepts in various system layers and iteratively found salient solutions that drastically improved overall XDBMS performance. XML query processing depends critically on the smooth interplay of concepts and methods. Here, we focus on the physical level of XML processing: node labeling and mapping options for storage structures; the design of suitable index mechanisms; and enriched functionality of path processing operators, in particular for holistic twig joins. In this survey, we outline the experience gained during the evolution of XTC and develop “key concepts” to enable fine-grained, effective, and efficient XML processing.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Härder, T., Mathis, C. (2010). Key Concepts for Native XML Processing. In: Sachs, K., Petrov, I., Guerrero, P. (eds) From Active Data Management to Event-Based Systems and More. Lecture Notes in Computer Science, vol 6462. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17226-7_1
DOI: https://doi.org/10.1007/978-3-642-17226-7_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17225-0
Online ISBN: 978-3-642-17226-7
eBook Packages: Computer Science, Computer Science (R0)