DOI: 10.1145/1376616.1376677
research-article

Efficiently answering reachability queries on very large directed graphs

Published: 09 June 2008

ABSTRACT

Efficiently processing queries against very large graphs is an important research topic, largely driven by emerging real-world applications as diverse as XML databases, GIS, web mining, social network analysis, ontologies, and bioinformatics. In particular, graph reachability has attracted a lot of research attention: reachability queries are not only common on graph databases, but also serve as fundamental operations for many other graph queries. The main idea behind answering reachability queries in graphs is to build indices based on reachability labels. Essentially, each vertex in the graph is assigned labels such that the reachability between any two vertices can be determined from their labels alone. Several approaches have been proposed for building these reachability labels; among them are interval labeling (tree cover) and 2-hop labeling. However, because many real-world graphs contain a large number of vertices (some can easily contain millions), the computational cost and index size of labels built with existing methods are too high to be practical. In this paper, we introduce a novel graph structure, referred to as path-tree, to help label very large graphs. The path-tree cover is a spanning subgraph of the input graph G in a tree shape. We demonstrate both analytically and empirically the effectiveness of our new approaches.
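
The labeling idea described in the abstract can be illustrated with a minimal sketch of interval labeling on a plain rooted tree. This is only the tree case that underlies tree cover, not the path-tree method the paper introduces, and it does not handle non-tree edges of a general directed graph. The names `interval_labels` and `reaches` and the sample tree are hypothetical, chosen for illustration.

```python
from typing import Dict, List, Tuple

Label = Tuple[int, int]  # (lowest postorder number in subtree, own postorder number)


def interval_labels(children: Dict[str, List[str]], root: str) -> Dict[str, Label]:
    """Label every vertex of a rooted tree with a postorder interval.

    Assumption: `children` describes a tree (each vertex has one parent).
    Uses an explicit stack so deep trees do not hit the recursion limit.
    """
    labels: Dict[str, Label] = {}
    next_post = 1
    # Each frame: [vertex, iterator over its children, lowest postorder seen so far]
    stack = [[root, iter(children.get(root, [])), None]]
    while stack:
        frame = stack[-1]
        vertex, child_iter, low = frame
        child = next(child_iter, None)
        if child is not None:
            # Descend into the next unvisited child first.
            stack.append([child, iter(children.get(child, [])), None])
            continue
        # All children are finished: assign this vertex its postorder number.
        post = next_post
        next_post += 1
        if low is None:  # leaf: the interval is a single point
            low = post
        labels[vertex] = (low, post)
        stack.pop()
        # The first finished child carries the smallest number in the parent's subtree.
        if stack and stack[-1][2] is None:
            stack[-1][2] = low
    return labels


def reaches(labels: Dict[str, Label], u: str, v: str) -> bool:
    """u reaches v in the tree iff v's postorder number lies inside u's interval."""
    low_u, post_u = labels[u]
    post_v = labels[v][1]
    return low_u <= post_v <= post_u


# Hypothetical sample tree: a -> b -> d, a -> c
tree = {"a": ["b", "c"], "b": ["d"], "c": []}
labels = interval_labels(tree, "a")
print(reaches(labels, "a", "d"))  # a is an ancestor of d
print(reaches(labels, "b", "c"))  # b and c are siblings
```

Once the labels are built, each query is answered by two integer comparisons, with no graph traversal. That is the property reachability indices aim for; the difficulty the abstract points to is keeping label construction cost and label size manageable when the graph is not a tree.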

Published in

SIGMOD '08: Proceedings of the 2008 ACM SIGMOD international conference on Management of data
June 2008
1396 pages
ISBN: 9781605581026
DOI: 10.1145/1376616

Copyright © 2008 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher: Association for Computing Machinery, New York, NY, United States

Overall Acceptance Rate: 785 of 4,003 submissions, 20%
