Abstract
XML query processing is one of the most active areas of database research. Although past research has focused mainly on processing structural XML queries, demand is growing for full-text search over XML documents. In this paper, we propose XICS (XML Indices for Content and Structural search), novel indices built on the B+-tree for fast processing of queries that combine structural and full-text search of XML documents. To represent the structural information of an XML tree, each node in the tree is labeled with an identifier. The identifier contains an integer that represents the path from the root node. XICS consists of two types of indices: the COB-tree (COntent B+-tree) and the STB-tree (STructure B+-tree). Each search key of the COB-tree is a pair consisting of a text fragment from the XML document and the identifier of the leaf node that contains that text, whereas the search keys of the STB-tree are the node identifiers themselves. Because the node identifier is part of the search key, we can retrieve only those entries that match the path information in the query. Our experimental results show that XICS processes such queries efficiently.
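The key layout described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a sorted Python list stands in for a B+-tree, and the identifier layout `(path_id, doc_order)`, the `PATH_IDS` table, and the sample entries are all assumptions made for illustration. The point is only that placing the path integer inside the search key turns a combined content and path lookup into one contiguous range scan.

```python
import bisect

# Hypothetical mapping from root-to-node label paths to integers.
# The abstract only says the identifier contains such an integer;
# the concrete numbering here is an assumption.
PATH_IDS = {"/article/title": 1, "/article/body/p": 2}

# COB-tree stand-in: keys are (term, path_id, doc_order), kept sorted
# the way a B+-tree would keep its leaf entries.
cob_keys = sorted([
    ("index", 1, 2),
    ("index", 2, 5),
    ("index", 2, 9),
    ("tree", 2, 5),
    ("xml", 1, 2),
    ("xml", 2, 9),
])

# STB-tree stand-in: keys are the node identifiers (path_id, doc_order).
stb_keys = sorted([(1, 2), (2, 5), (2, 9), (2, 12)])


def cob_search(keys, term, path):
    """Return identifiers of nodes on `path` whose text contains `term`."""
    pid = PATH_IDS[path]
    # All matching keys share the prefix (term, pid), so they are contiguous.
    lo = bisect.bisect_left(keys, (term, pid, -1))
    hi = bisect.bisect_left(keys, (term, pid + 1, -1))
    return [(p, n) for (_, p, n) in keys[lo:hi]]


def stb_search(keys, path):
    """Return identifiers of all nodes reachable by `path`."""
    pid = PATH_IDS[path]
    lo = bisect.bisect_left(keys, (pid, -1))
    hi = bisect.bisect_left(keys, (pid + 1, -1))
    return keys[lo:hi]


if __name__ == "__main__":
    print(cob_search(cob_keys, "index", "/article/body/p"))
    print(stb_search(stb_keys, "/article/body/p"))
```

In this sketch, entries for the term "index" under `/article/title` are never visited when the query asks for `/article/body/p`, which mirrors the abstract's claim that only entries matching the query's path information are retrieved.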
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shimizu, T., Yoshikawa, M. (2005). Full-Text and Structural XML Indexing on B+-Tree. In: Andersen, K.V., Debenham, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2005. Lecture Notes in Computer Science, vol 3588. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11546924_44
DOI: https://doi.org/10.1007/11546924_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28566-3
Online ISBN: 978-3-540-31729-6
eBook Packages: Computer Science, Computer Science (R0)