Abstract
One approach to building an efficient XML query processor is to use an RDBMS to store and query XML documents. XML queries contain a number of features that are either hard to translate into SQL or for which the resulting SQL is complex and inefficient. Among these features, path expressions pose a new challenge for efficient XML query processing in RDBMSs, and building index structures for them is necessary. Indexes, however, occupy considerable disk space, so there is a tradeoff between disk space consumption and the efficiency of query evaluation. In this paper, we present a cost model for the space consumption of indexes and their benefit to XML queries. Making use of the statistics of the XML data and the characteristics of the target application, we adopt a greedy algorithm to select the map indexes to be built. Our experimental study demonstrates that query performance improves significantly over the case without indexes while consuming only a modest amount of disk space.
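The abstract does not spell out the cost model or the exact greedy rule, so the following is only a minimal sketch of one common way such a selection could work, not the authors' algorithm. It assumes each candidate index has an estimated space cost and an estimated benefit to the query workload, that benefits are independent of each other, and that candidates are ranked by benefit per unit of space under a fixed disk budget. The names `CandidateIndex`, `select_indexes`, and `budget` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateIndex:
    # All fields are illustrative assumptions, not taken from the paper.
    name: str       # label for the index, e.g. the path it covers
    space: float    # estimated disk space the index would occupy
    benefit: float  # estimated query-cost saving over the workload


def select_indexes(candidates, budget):
    """Greedily pick indexes by benefit per unit of space within a space budget."""
    chosen = []
    remaining = budget
    # Ignore candidates with no benefit or no meaningful size, then rank the
    # rest by benefit-to-space ratio, best first.
    ranked = sorted(
        (c for c in candidates if c.space > 0 and c.benefit > 0),
        key=lambda c: c.benefit / c.space,
        reverse=True,
    )
    for c in ranked:
        # Take the index only if it still fits in the remaining budget.
        if c.space <= remaining:
            chosen.append(c)
            remaining -= c.space
    return chosen
```

A ratio-based greedy rule like this is simple and fast but is a heuristic: it does not guarantee the best possible set of indexes, and it ignores any interaction between indexes that a fuller cost model might capture.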
This work was supported by the Hi-Tech Research and Development Program of China under grant No. 2002AA413110. Shuigeng Zhou was also supported by the Hi-Tech Research and Development Program of China under grant No. 2002AA135340 and partially supported by the Open Research Fund Program of State Key Lab of Software Engineering of China under grant No. SKL(4)003.
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Guo, Z., Xu, Z., Zhou, S., Zhou, A., Li, M. (2003). Index Selection for Efficient XML Path Expression Processing. In: Jeusfeld, M.A., Pastor, Ó. (eds) Conceptual Modeling for Novel Application Domains. ER 2003. Lecture Notes in Computer Science, vol 2814. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39597-3_27
DOI: https://doi.org/10.1007/978-3-540-39597-3_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20257-8
Online ISBN: 978-3-540-39597-3
eBook Packages: Springer Book Archive