Abstract
The increasing popularity of XML has generated considerable interest in query processing over graph-structured data. To support efficient evaluation of path expressions, structural indexes have been proposed. However, most variants of structural indexes ignore inter- or intra-document references and assume that XML documents are tree-structured. Extending these indexes to large XML graphs with intra- or inter-document links requires substantial computing power to build the indexes and substantial space to store them. Moreover, efficiently evaluating ancestor-descendant queries over arbitrary graphs with long paths is a severe problem. In this paper, we propose a scalable connection index based on the concept of 2-hop covers introduced by Cohen et al. The proposed index-creation algorithm substantially reduces the size of the original graph, producing a directed acyclic graph with fewer nodes and edges. This reduces the number of computing steps required to build the index, and thus both computing time and space. The index also permits efficient evaluation of ancestor-descendant relationships. Moreover, unlike most other work, the proposed index is optimized for descendants-or-self queries on arbitrary graphs with link relationships.
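To make the two ideas in the abstract concrete, the following minimal Python sketch first shrinks a link graph to a directed acyclic graph and then answers reachability through 2-hop labels. It is not the HID algorithm of the paper. It assumes that the graph is reduced by collapsing strongly connected components (Kosaraju's algorithm) and it substitutes a simple pruned breadth-first labeling for the cover construction of Cohen et al. All names (`condense`, `build_labels`, `reaches`) and the sample edges are hypothetical.

```python
from collections import defaultdict, deque

def condense(edges):
    """Collapse strongly connected components (Kosaraju) into a DAG."""
    graph, rev, nodes = defaultdict(set), defaultdict(set), set()
    for u, v in edges:
        graph[u].add(v)
        rev[v].add(u)
        nodes.update((u, v))
    order, seen = [], set()
    for start in nodes:                      # pass 1: record DFS finish order
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:                            # all successors done: node finishes
                order.append(node)
                stack.pop()
    comp = {}                                # node -> component representative
    for start in reversed(order):            # pass 2: DFS on the reversed graph
        if start in comp:
            continue
        comp[start] = start
        stack = [start]
        while stack:
            node = stack.pop()
            for prev in rev[node]:
                if prev not in comp:
                    comp[prev] = start
                    stack.append(prev)
    dag = defaultdict(set)
    for u, v in edges:
        if comp[u] != comp[v]:
            dag[comp[u]].add(comp[v])
    return comp, dag

def build_labels(comp, dag):
    """Assign 2-hop labels on the DAG with pruned BFS (an assumed stand-in)."""
    nodes = set(comp.values())
    rdag = defaultdict(set)
    for u, targets in dag.items():
        for v in targets:
            rdag[v].add(u)
    l_in = {n: set() for n in nodes}         # hubs that reach n
    l_out = {n: set() for n in nodes}        # hubs that n reaches

    def sweep(hub, adj, forward):
        queue, seen = deque([hub]), {hub}
        while queue:
            node = queue.popleft()
            src, dst = (hub, node) if forward else (node, hub)
            if node != hub and not l_out[src].isdisjoint(l_in[dst]):
                continue                     # pair already covered by an earlier hub
            (l_in if forward else l_out)[node].add(hub)
            for nxt in adj.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)

    # Heuristic: process high-degree nodes first so they become shared hubs.
    degree = lambda n: len(dag.get(n, ())) + len(rdag.get(n, ()))
    for hub in sorted(nodes, key=degree, reverse=True):
        sweep(hub, dag, True)
        sweep(hub, rdag, False)
    return l_in, l_out

def reaches(u, v, comp, l_in, l_out):
    """Descendant-or-self test: True if v is reachable from u."""
    cu, cv = comp[u], comp[v]
    return cu == cv or not l_out[cu].isdisjoint(l_in[cv])

# Hypothetical link graph: a, b, c form a cycle, as a back-reference would.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("d", "e"), ("f", "e")]
comp, dag = condense(edges)
l_in, l_out = build_labels(comp, dag)
print(reaches("a", "e", comp, l_in, l_out))  # True
print(reaches("e", "a", comp, l_in, l_out))  # False
```

In this sketch the cycle a, b, c collapses into a single DAG node, so six original nodes shrink to four before any labels are computed. A query then needs only one set intersection between the out-label of the source and the in-label of the target.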
References
Cooper, B.F., Sample, N., Franklin, M.J., Hjaltason, G.R., Shadmon, M.: A fast index for semistructured data. In: VLDB 2001 Proceedings of 27th International Conference on Very Large Data Bases, Roma, Italy, September 11-14. Morgan Kaufmann, San Francisco (2001)
Kaplan, H., Milo, T.: Short and simple labels for small distances and other functions. In: Dehne, F., Sack, J.-R., Tamassia, R. (eds.) WADS 2001. LNCS, vol. 2125, pp. 246–257. Springer, Heidelberg (2001)
Barashev, D., et al.: Indexing XML to Support Path Expressions. In: 6th East-European Conference on Advances in Databases and Information Systems, ADBIS (2002)
Cohen, E., et al.: Labeling dynamic XML trees. In: Symposium on Principles of Database Systems (PODS), pp. 271–281 (2002)
Qun, C., et al.: D(K)-Index: An Adaptive Structural Summary for Graph-based Data. In: ACM SIGMOD Int. Conference on Management of Data, pp. 134–144 (2003)
Milo, T., Suciu, D.: Index Structures for path expressions. In: 7th International Conference on Database Theory (ICDT), pp. 277–295 (1999)
Cohen, E., Halperin, E., Kaplan, H., Zwick, U.: Reachability and distance queries via 2-hop labels. In: Proceedings Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 937–946. ACM Press, New York (2002)
Goldman, R., Widom, J.: DataGuides: Enabling query formulation and optimization in semistructured databases. In: Jarke, M., Carey, M.J., Dittrich, K.R., Lochovsky, F.H., Loucopoulos, P., Jeusfeld, M.A. (eds.) VLDB 1997, Proceedings of 23rd International Conference on Very Large Data Bases, Athens, Greece, August 25-29, 1997, pp. 436–445. Morgan Kaufmann, San Francisco (1997)
Kaushik, R., et al.: Covering Indexes for Branching Path Queries. In: ACM SIGMOD Int. Conference on Management of Data, pp. 133–144 (2002)
Chung, C.-W., Min, J.-K., Shim, K.: APEX: An adaptive path index for XML data. In: Franklin, et al. (eds.) [6], pp. 121–132
Schenkel, R., et al.: HOPI: An efficient connection index for complex XML document collections. In: Bertino, E., Christodoulakis, S., Plexousakis, D., Christophides, V., Koubarakis, M., Böhm, K., Ferrari, E. (eds.) EDBT 2004. LNCS, vol. 2992, pp. 237–255. Springer, Heidelberg (2004)
Kaushik, R., et al.: Exploiting Local Similarity for Indexing Paths in Graph-Structured Data. In: 18th Int. Conference on Data Engineering, ICDE (2002)
Sayed, A., Unland, R.: Index-support on XML Documents Containing Links. In: IEEE Midwest Symposium on Circuits and Systems (2003)
Kaplan, H., et al.: A Comparison of labeling schemes for ancestor queries. In: 13th ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 954–963 (2002)
The Mondial Database, http://dbis.informatik.uni-goettingen.de/Mondial/
Abiteboul, S., et al.: Compact labeling schemes for ancestor queries. In: 12th ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 547–556 (2001)
Cormen, T.H., et al.: Introduction to algorithms, 2nd edn., ch. 22-23 (2001)
Nuutila, E., Soisalon-Soininen: Efficient Transitive Closure Computation. Technical Report TKO-B113 (1993)
Li, Q., Moon, B.: Indexing and querying XML Data for Regular Path Expressions. In: 27th Int. Conference on Very Large Data Bases (VLDB), pp. 361–370 (2001)
Yoshikawa, M., Amagasa, T.: XRel: A Path-Based Approach to Storage and Retrieval of XML Documents Using Relational Databases. ACM Transactions on Internet Technology, TOIT (2001)
Tatarinov, S.D., Zhang, C.: Storing and Querying Ordered XML Using a Relational Database System. In: ACM SIGMOD Int. Conference on Management of Data, pp. 204–215 (2002)
Zhang, C., Naughton, J.F., DeWitt, D.J., Luo, Q., Lohman, G.: On Supporting Containment Queries in Relational Database Management Systems. In: ACM SIGMOD Int. Conference on Management of Data (2001)
Chien, S.-Y., et al.: Efficient Structural Joins on Indexed XML Documents. In: 28th Int. Conference on Very Large Data Bases, VLDB (2002)
XML Linking Language (XLink) Version 1.0, W3C Recommendation (June 27, 2001), http://www.W3.org/TR/xlink
XML Pointer Language (XPointer), W3C Working Draft (August 16, 2002), http://www.w3.org/TR/xptr
The Internet Movie Database, http://www.imdb.com
The XML benchmark project, http://www.xml-benchmark.org
Abiteboul, S., Buneman, P., Suciu, D.: Data on the Web: from relations to semistructured data and XML. Morgan Kaufmann Publishers, Los Altos (1999)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sayed, A., Unland, R. (2005). HID: An Efficient Path Index for Complex XML Collections with Arbitrary Links. In: Bhalla, S. (eds) Databases in Networked Information Systems. DNIS 2005. Lecture Notes in Computer Science, vol 3433. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31970-2_6
DOI: https://doi.org/10.1007/978-3-540-31970-2_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25361-7
Online ISBN: 978-3-540-31970-2
eBook Packages: Computer Science, Computer Science (R0)