Abstract
XML DBMSs require new indexing techniques to efficiently process structural search and full-text search as integrated in XQuery. Much research has been done on indexing XML documents. In this paper we first survey some of these techniques and suggest a classification scheme. It appears that most techniques index paths in XML documents and maintain a separate index on values. In some cases, the two indexes are merged and/or tags are encoded. We propose a new method that indexes XML documents on ordered trees, i.e., two documents are in the same equivalence class if they have the same tree structure, with identical elements in the same order. We develop a simple benchmark to compare our method with two well-known European products. The results show that indexing on full trees leads to a smaller index size and achieves 1 to 10 times better query performance than classical, path-based industrial methods.
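The equivalence relation described in the abstract can be illustrated with a minimal sketch. This is not the SIOUX implementation; it only assumes that "same tree structure, with identical elements in order" means the ordered tree of element tags, ignoring text values and attributes. The function names `tree_signature` and `group_by_structure` and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def tree_signature(elem):
    """Serialize the ordered tag structure of an element.

    Text content and attributes are ignored (an assumption of this sketch),
    so only element names and their order contribute to the signature.
    """
    children = "".join(tree_signature(child) for child in elem)
    return f"({elem.tag}{children})"


def group_by_structure(docs):
    """Group document ids into classes that share the same ordered tree."""
    classes = defaultdict(list)
    for doc_id, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        classes[tree_signature(root)].append(doc_id)
    return dict(classes)


# Hypothetical sample documents: d1 and d2 differ only in values,
# d3 has the same elements in a different order.
docs = {
    "d1": "<book><title>A</title><author>X</author></book>",
    "d2": "<book><title>B</title><author>Y</author></book>",
    "d3": "<book><author>Z</author><title>C</title></book>",
}
classes = group_by_structure(docs)
for signature, doc_ids in classes.items():
    print(signature, doc_ids)
```

Under these assumptions, `d1` and `d2` fall into one class while `d3` forms its own, because element order is part of the signature. An index keyed on such signatures stores one entry per distinct tree shape rather than one per path, which is the intuition behind the smaller index size the abstract reports.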
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gardarin, G., Yeh, L. (2005). SIOUX: An Efficient Index for Processing Structural XQueries. In: Andersen, K.V., Debenham, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2005. Lecture Notes in Computer Science, vol 3588. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11546924_55
DOI: https://doi.org/10.1007/11546924_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28566-3
Online ISBN: 978-3-540-31729-6
eBook Packages: Computer Science (R0)