ABSTRACT
Shortest-path query processing is a long-established routine in many applications, and it is increasingly used to support novel graph applications over very large databases. Large graphs bring a new scenario: queries are issued intensively against arbitrary nodes, and each must quickly return a node distance or even a shortest path. Traditional main-memory algorithms and shortest-path materialization become inadequate in this setting. We are interested in graph labelings, which encode the underlying graph and assign labels to nodes so that queries can be processed efficiently. Surprisingly, existing work in this category mainly emphasizes reachability query processing, and little effort has been devoted to distance labelings that support exact shortest-distance queries between nodes. A distance labeling must be developed on the graph as a whole in order to retain node distance information correctly, which makes many existing methods inapplicable. We focus on the fast computation of distance-aware 2-hop covers, which can encode the all-pairs shortest paths of a graph in O(|V|·|E|^{1/2}) space. Our approach exploits strongly connected component collapsing and graph partitioning to gain speed, while overcoming the challenges of correctly retaining node distance information and encoding all-pairs shortest paths with small overhead. Furthermore, our approach avoids pre-computing all-pairs shortest paths, which can be prohibitive on large graphs. We conducted extensive performance studies, which confirm the efficiency of our proposed approaches.
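To illustrate how a distance-aware 2-hop cover answers a query, the following minimal Python sketch stores, for each node, an out-label and an in-label that map hub nodes to distances, and returns the smallest sum over the hubs the two labels share. The toy graph, the hand-built labels, and the names `query_distance`, `bfs_distances`, `L_OUT`, and `L_IN` are assumptions made for illustration only; the sketch shows the query side and is not the label construction algorithm proposed in the paper.

```python
from collections import deque
from math import inf

# Hypothetical toy graph (directed, unit edge weights): a -> b -> c -> d
GRAPH = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}

# Hand-built 2-hop distance labels (an assumption for this sketch).
# L_OUT[u][w] = dist(u, w) for hubs w reachable from u.
# L_IN[v][w]  = dist(w, v) for hubs w that reach v.
L_OUT = {
    "a": {"a": 0, "b": 1, "c": 2},
    "b": {"b": 0, "c": 1},
    "c": {"c": 0},
    "d": {"d": 0},
}
L_IN = {
    "a": {"a": 0},
    "b": {"b": 0},
    "c": {"c": 0},
    "d": {"d": 0, "c": 1},
}


def query_distance(l_out, l_in, u, v):
    """Return min over shared hubs w of dist(u, w) + dist(w, v), or inf."""
    best = inf
    for hub, d_u_hub in l_out[u].items():
        d_hub_v = l_in[v].get(hub)
        if d_hub_v is not None:
            best = min(best, d_u_hub + d_hub_v)
    return best


def bfs_distances(adj, source):
    """Reference distances from source by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist
```

A query touches only the two labels involved, so its cost depends on label size rather than graph size. In this toy instance, `query_distance(L_OUT, L_IN, "a", "d")` goes through the shared hub `c`, and an unreachable pair such as `("d", "a")` shares no hub and yields `inf`.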