Abstract
As XML has become the de facto standard for data presentation and exchange on the Web, XML query optimization has emerged as an important research issue. It is widely accepted that structural joins, which evaluate the containment (ancestor-descendant) relationships between XML elements, are important to XML query processing. Estimating structural join size accurately and quickly is therefore crucial to the success of XML query plan selection. In this paper, we propose applying the Cosine transform to structural join size estimation. Our approach captures the structural information of XML data using mathematical functions, which are then approximated by the Cosine series. We derive a simple formula to estimate the structural join size using the Cosine series. Theoretical analyses and extensive experiments have been performed. The experimental results show that, compared with the state-of-the-art IM-DA-Est method, our method is several orders of magnitude faster, requires less memory, and yields better or comparable estimates.
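To make the estimated quantity concrete, the following Python sketch computes an exact structural join size and then a cosine-series approximation of it. It is an illustration under stated assumptions, not the formula derived in the paper: it assumes each element is encoded as a (start, end) interval normalized to [0, 1], that intervals are properly nested as in a well-formed XML tree, and that the "mathematical function" is the coverage function of the ancestor set (the number of ancestor intervals containing a position). The names `exact_join_size`, `cosine_coefficients`, and `estimate_join_size` are hypothetical.

```python
import math


def exact_join_size(ancestors, descendants):
    """Count (a, d) pairs in which interval a strictly contains interval d."""
    return sum(
        1
        for a_start, a_end in ancestors
        for d_start, d_end in descendants
        if a_start < d_start and d_end < a_end
    )


def cosine_coefficients(intervals, num_terms):
    """Cosine-series coefficients on [0, 1] of the coverage function
    f(x) = number of intervals containing x (an assumed summary function)."""
    # a_0 is the integral of f over [0, 1]: the total interval length.
    coeffs = [sum(end - start for start, end in intervals)]
    for k in range(1, num_terms):
        w = k * math.pi
        # a_k = 2 * integral of f(x) * cos(k*pi*x), in closed form per interval.
        coeffs.append(
            2.0 * sum((math.sin(w * end) - math.sin(w * start)) / w
                      for start, end in intervals)
        )
    return coeffs


def estimate_join_size(coeffs, descendants):
    """Evaluate the truncated series at each descendant start and sum up."""
    total = 0.0
    for d_start, _ in descendants:
        total += coeffs[0] + sum(
            coeffs[k] * math.cos(k * math.pi * d_start)
            for k in range(1, len(coeffs))
        )
    return total


def scale(intervals, width):
    """Normalize integer positions to [0, 1]."""
    return [(s / width, e / width) for s, e in intervals]


# Hypothetical, properly nested toy document with positions in 1..19.
ancestors = scale([(1, 10), (2, 7), (12, 18)], 20)
descendants = scale([(3, 4), (5, 6), (8, 9), (13, 14)], 20)

exact = exact_join_size(ancestors, descendants)
estimate = estimate_join_size(cosine_coefficients(ancestors, 2000), descendants)
print(exact, round(estimate, 3))
```

Only the coefficient list has to be stored for the ancestor set, so the number of retained terms trades memory for accuracy. This sketch keeps many terms because the coverage function is piecewise constant and its series converges slowly near interval boundaries.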
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Luo, C., Jiang, Z., Hou, WC., Zhu, Q., Wang, CF. (2006). Applying Cosine Series to XML Structural Join Size Estimation. In: Bressan, S., Küng, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2006. Lecture Notes in Computer Science, vol 4080. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11827405_74
DOI: https://doi.org/10.1007/11827405_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37871-6
Online ISBN: 978-3-540-37872-3
eBook Packages: Computer Science, Computer Science (R0)