Abstract
Uncertain graphs have been widely used to represent graph data with inherent structural uncertainty. Reliability search is a fundamental problem in uncertain graph analytics. This paper investigates a new problem with broad real-world applications, the top-k reliability search problem on uncertain graphs: given a source vertex s, find the k vertices v with the highest reliability of connection from s to v. The existing algorithm for the threshold-based reliability search problem is inefficient for the top-k reliability search problem. We propose a new algorithm that solves the top-k reliability search problem efficiently. The algorithm adopts two important techniques, namely the BFS sharing technique and the offline sampling technique. The BFS sharing technique exploits overlaps among the sampled possible worlds of the input uncertain graph and performs a single BFS on all possible worlds simultaneously. The offline sampling technique samples possible worlds offline and stores them in a compact structure. The algorithm also takes advantage of bit vectors and bitwise operations to improve efficiency. In addition, we generalize the top-k reliability search problem from the single-source case to the multi-source case and show that the multi-source case can be equivalently converted to the single-source case. Moreover, we define two types of reverse top-k reliability search problems with different semantics on uncertain graphs and propose appropriate solutions for both of them. Extensive experiments on both real and synthetic datasets verify that the optimized algorithm outperforms the baselines by 1–2 orders of magnitude in execution time while achieving comparable accuracy. The optimized algorithm also scales linearly with the size of the input uncertain graph.
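To make the BFS sharing and bit-vector ideas concrete, the following is a minimal sketch, not the algorithm evaluated in this paper. It assumes a directed uncertain graph given as a list of (u, v, probability) triples with independent edges, and it uses plain Monte Carlo sampling. The function names (`sample_edge_masks`, `shared_bfs`, `top_k_reliability`) and the use of Python integers as bit vectors are illustrative assumptions.

```python
import heapq
import random
from collections import defaultdict, deque


def sample_edge_masks(edges, num_worlds, rng):
    """Sampling step: bit i of an edge's mask is set iff the edge exists in world i."""
    masked_edges = []
    for u, v, p in edges:
        mask = 0
        for i in range(num_worlds):
            if rng.random() < p:
                mask |= 1 << i
        masked_edges.append((u, v, mask))
    return masked_edges


def shared_bfs(masked_edges, source, num_worlds):
    """One traversal over all sampled worlds at once.

    reach[v] is a bit vector: bit i is set iff v is reachable from source in world i.
    """
    adj = defaultdict(list)
    for u, v, mask in masked_edges:
        adj[u].append((v, mask))

    reach = defaultdict(int)
    reach[source] = (1 << num_worlds) - 1  # the source is reachable in every world
    queue = deque([source])
    queued = {source}
    while queue:
        u = queue.popleft()
        queued.discard(u)
        for v, mask in adj[u]:
            # Worlds in which v becomes newly reachable through edge (u, v).
            new_bits = reach[u] & mask & ~reach[v]
            if new_bits:
                reach[v] |= new_bits
                if v not in queued:
                    queue.append(v)
                    queued.add(v)
    return reach


def top_k_reliability(edges, source, k, num_worlds=1000, seed=0):
    """Estimate reliabilities from source and return the k highest as (vertex, estimate)."""
    rng = random.Random(seed)
    masked_edges = sample_edge_masks(edges, num_worlds, rng)
    reach = shared_bfs(masked_edges, source, num_worlds)
    estimates = {
        v: bin(bits).count("1") / num_worlds  # fraction of worlds where v is reachable
        for v, bits in reach.items()
        if v != source
    }
    return heapq.nlargest(k, estimates.items(), key=lambda item: item[1])
```

In this sketch, `sample_edge_masks` plays the role of an offline step, since the masks could be computed once and reused across queries; the compact storage structure mentioned above is not modeled. The bitwise AND and OR operations process all sampled worlds for an edge in one step, which is the intuition behind sharing a single BFS across possible worlds.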
Acknowledgments
This work was partially supported by the 973 Program of China (No. 2012CB036202), the National Natural Science Foundation of China (Nos. 61532015 and 61173023) and the HIT-Tencent Open Research Fund.
Cite this article
Zhu, R., Zou, Z. & Li, J. Towards efficient top-k reliability search on uncertain graphs. Knowl Inf Syst 50, 723–750 (2017). https://doi.org/10.1007/s10115-016-0961-9
DOI: https://doi.org/10.1007/s10115-016-0961-9