Abstract
Reachability querying is a fundamental problem in graph databases. A reachability query asks whether a path exists between a source vertex and a destination vertex, and it is widely used in applications including road networks, social networks, the World Wide Web, and bioinformatics. In some important emerging applications, uncertainty is inherent in the graph; for instance, each edge may be associated with a probability of existing. In this paper, we study the reachability problem over such uncertain graphs in a threshold fashion: determine whether a source vertex can reach a destination vertex with probability larger than a user-specified value t. Computing reachability on uncertain graphs has been proved to be NP-hard. We first propose novel and effective bounding techniques to obtain an upper bound on the reachability probability between the source and the destination. If the upper bound fails to prune the query, efficient dynamic Monte Carlo simulation techniques are applied to answer the probabilistic reachability query with an accuracy guarantee. Extensive experiments over real and synthetic datasets demonstrate the efficiency and effectiveness of our techniques.
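The bound-then-sample pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's BMC method: the upper bound here is a simple cut bound chosen for illustration (the destination is reachable only if at least one out-edge of the source and at least one in-edge of the destination exists), the sampler is plain rather than dynamic Monte Carlo, and the sample size comes from a Hoeffding-style bound. The names cut_upper_bound, sample_reaches, threshold_reach, eps, and delta are all assumptions introduced for this example, as is the assumption that edges are directed and independent.

```python
import math
import random
from collections import deque


def cut_upper_bound(edges, s, d):
    """Illustrative upper bound (not the paper's): d is reachable from s only
    if some out-edge of s exists and some in-edge of d exists."""
    if s == d:
        return 1.0
    miss_out = 1.0  # probability that every out-edge of s is absent
    miss_in = 1.0   # probability that every in-edge of d is absent
    for (u, v), p in edges.items():
        if u == s:
            miss_out *= 1.0 - p
        if v == d:
            miss_in *= 1.0 - p
    return min(1.0 - miss_out, 1.0 - miss_in)


def sample_reaches(adj, s, d, rng):
    """Draw one possible world lazily: an edge's coin is flipped only when a
    breadth-first search from s first examines it."""
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == d:
            return True
        for v, p in adj.get(u, ()):
            if v not in seen and rng.random() < p:
                seen.add(v)
                queue.append(v)
    return False


def threshold_reach(edges, s, d, t, eps=0.05, delta=0.05, seed=0):
    """Decide whether the estimated P(s reaches d) exceeds t.

    edges maps a directed pair (u, v) to its existence probability.
    The estimate is within eps of the true value with probability at
    least 1 - delta (Hoeffding bound on the sample count).
    """
    # Step 1: try to prune the query with the cheap upper bound.
    if cut_upper_bound(edges, s, d) <= t:
        return False
    # Step 2: fall back to Monte Carlo sampling.
    adj = {}
    for (u, v), p in edges.items():
        adj.setdefault(u, []).append((v, p))
    n = math.ceil(math.log(2.0 / delta) / (2.0 * eps * eps))
    rng = random.Random(seed)
    hits = sum(sample_reaches(adj, s, d, rng) for _ in range(n))
    return hits / n > t
```

For a two-edge chain a → b → c with edge probability 0.5 each, the true reachability probability from a to c is 0.25 and the cut bound is 0.5, so a query with t = 0.6 is answered by the bound alone, while a query with t = 0.1 falls through to sampling.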
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhu, K., Zhang, W., Zhu, G., Zhang, Y., Lin, X. (2011). BMC: An Efficient Method to Evaluate Probabilistic Reachability Queries. In: Yu, J.X., Kim, M.H., Unland, R. (eds) Database Systems for Advanced Applications. DASFAA 2011. Lecture Notes in Computer Science, vol 6587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20149-3_32
DOI: https://doi.org/10.1007/978-3-642-20149-3_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20148-6
Online ISBN: 978-3-642-20149-3
eBook Packages: Computer Science (R0)