Path-tree: An efficient reachability indexing scheme for large directed graphs

Published: 18 March 2011

Abstract

The reachability query is one of the fundamental queries in graph databases. The main idea behind answering reachability queries is to assign labels to vertices such that the reachability between any two vertices can be determined from the labels alone. Though several approaches have been proposed for building these reachability labels, two issues remain open: how to handle the increasingly large number of vertices in real-world graphs, and how to find the best tradeoff among the labeling size, the query answering time, and the construction time. In this article, we introduce a novel graph structure, referred to as path-tree, to help label very large graphs. The path-tree cover is a spanning subgraph of the graph G in a tree shape. We show that path-tree can be generalized to chain-tree, which theoretically can have a smaller labeling cost. On top of the path-tree and chain-tree indexes, we also introduce a new compression scheme that groups vertices with similar labels together to further reduce the labeling size. In addition, we propose an efficient incremental update algorithm for dynamic index maintenance. Finally, we demonstrate both analytically and empirically the effectiveness and efficiency of our new approaches.
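
To make the labeling idea concrete, the sketch below shows the classic interval labeling of a rooted tree, in which each vertex stores a pair of integers and a reachability query reduces to an interval containment test. It is only an illustration of label-based reachability in general and is not the path-tree or chain-tree index described in this article. The names `interval_labels` and `reaches`, and the adjacency-dictionary input format, are assumptions made for the example.

```python
from typing import Dict, Hashable, List, Tuple

Labels = Dict[Hashable, Tuple[int, int]]


def interval_labels(tree: Dict[Hashable, List[Hashable]], root: Hashable) -> Labels:
    """Label each vertex of a rooted tree with (start, end).

    start is the vertex's preorder number; end is the largest preorder
    number found in its subtree. An explicit stack avoids recursion
    limits on deep trees.
    """
    done = object()  # sentinel marking an exhausted child iterator
    start = {root: 0}
    labels: Labels = {}
    counter = 1
    stack = [(root, iter(tree.get(root, ())))]
    while stack:
        vertex, children = stack[-1]
        child = next(children, done)
        if child is done:
            # All descendants are numbered, so the interval can be closed.
            stack.pop()
            labels[vertex] = (start[vertex], counter - 1)
        else:
            start[child] = counter
            counter += 1
            stack.append((child, iter(tree.get(child, ()))))
    return labels


def reaches(labels: Labels, u: Hashable, v: Hashable) -> bool:
    """Return True if v lies in the subtree of u (u reaches v in the tree)."""
    low, high = labels[u]
    return low <= labels[v][0] <= high


if __name__ == "__main__":
    example = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
    example_labels = interval_labels(example, "a")
    print(reaches(example_labels, "a", "d"))
    print(reaches(example_labels, "b", "c"))
```

Each query here costs two integer comparisons, and each vertex stores two integers. This works only because the input is a tree; general directed graphs need richer labels, which is the tradeoff among labeling size, query time, and construction time that the abstract refers to.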

    • Published in

      ACM Transactions on Database Systems, Volume 36, Issue 1
      March 2011
      251 pages
      ISSN: 0362-5915
      EISSN: 1557-4644
      DOI: 10.1145/1929934

      Copyright © 2011 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 18 March 2011
      • Accepted: 1 September 2010
      • Revised: 1 April 2010
      • Received: 1 August 2009