Abstract
Graphs have great expressive power to describe the complex relationships among data objects, and large graph datasets are available. In this paper, we focus on processing a primitive graph query, which we call the reachability query. The reachability query, denoted \(A \rightsquigarrow D\), finds all elements of a type D that are reachable from some element of another type A. The problem is challenging because the existing structural join algorithms, studied in XML query processing, rely heavily on the tree structure and therefore cannot be applied to it directly. We propose a novel approach that processes reachability queries on the fly while keeping small the space needed to store the information required for processing them. In brief, our approach is based on 2-hop labeling for a directed graph G, which consumes O(|V| log |E|) space. We construct a novel join-index built on a small table and a B+-tree, and with this join-index high efficiency is achieved. We conducted extensive experimental studies, which confirm that our approach can efficiently process reachability queries over a graph or a tree.
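To make the idea of answering \(A \rightsquigarrow D\) from 2-hop labels concrete, the following is a minimal sketch and not the construction used in the paper. It assumes a naive labeling in which every node serves as a center, so the labels are far larger than the O(|V| log |E|) bound quoted above; it also omits the join-index and B+-tree entirely. The function names (`build_two_hop_labels`, `reachable`, `reachability_query`) and the `type_of` mapping are hypothetical names introduced for illustration only. The property it illustrates is the 2-hop test itself: u reaches v exactly when the out-label of u and the in-label of v share a center.

```python
from collections import deque


def bfs(adj, start):
    """Return the set of nodes reachable from start, including start."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def build_two_hop_labels(nodes, edges):
    """Naive 2-hop labeling (assumption: every node is used as a center)."""
    fwd, bwd = {}, {}
    for u, v in edges:
        fwd.setdefault(u, []).append(v)
        bwd.setdefault(v, []).append(u)
    l_out = {v: {v} for v in nodes}  # centers that v can reach
    l_in = {v: {v} for v in nodes}   # centers that can reach v
    for w in nodes:
        for v in bfs(fwd, w):        # w reaches v, so w is an in-center of v
            l_in[v].add(w)
        for u in bfs(bwd, w):        # u reaches w, so w is an out-center of u
            l_out[u].add(w)
    return l_out, l_in


def reachable(l_out, l_in, u, v):
    """2-hop test: u reaches v iff the two labels share a center."""
    return not l_out[u].isdisjoint(l_in[v])


def reachability_query(l_out, l_in, type_of, a, d):
    """Return all nodes of type d reachable from some node of type a."""
    sources = [n for n, t in type_of.items() if t == a]
    return sorted(
        n for n, t in type_of.items()
        if t == d and any(reachable(l_out, l_in, s, n) for s in sources)
    )


# Hypothetical toy graph: node 7 has type D but no type-A node reaches it.
nodes = [1, 2, 3, 4, 5, 6, 7]
edges = [(1, 2), (2, 3), (3, 6), (4, 5)]
type_of = {1: "A", 2: "B", 3: "D", 4: "A", 5: "D", 6: "B", 7: "D"}
l_out, l_in = build_two_hop_labels(nodes, edges)
result = reachability_query(l_out, l_in, type_of, "A", "D")
```

In this sketch the query is evaluated as a nested loop over the A and D elements. A practical 2-hop cover would select a small set of centers so that the labels stay compact, which is what makes the space bound in the abstract attainable.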
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cheng, J., Yu, J.X., Tang, N. (2006). Fast Reachability Query Processing. In: Li Lee, M., Tan, KL., Wuwongse, V. (eds) Database Systems for Advanced Applications. DASFAA 2006. Lecture Notes in Computer Science, vol 3882. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11733836_47
DOI: https://doi.org/10.1007/11733836_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33337-1
Online ISBN: 978-3-540-33338-8
eBook Packages: Computer Science, Computer Science (R0)