Abstract
The reliability polynomial of a graph counts its connected subgraphs of various sizes. Algorithms based on sequential importance sampling (SIS) have been proposed to estimate a graph's reliability polynomial. We develop an improved SIS algorithm for estimating the reliability polynomial. The new algorithm runs in expected time O(m log n · α(m, n)) at worst and ≈ m in practice, compared to Θ(m²) for the previous algorithm. We analyze the error bounds of this algorithm, including comparison to alternative estimation algorithms. In addition to the theoretical analysis, we discuss methods for estimating the variance and describe experimental results on a variety of random graphs.
References
Beichl, I., Cloteaux, B., Sullivan, F.: An approximation algorithm for the coefficients of the reliability polynomial. Congr. Numer. 197, 143–151 (2010)
Chechik, S., Emek, Y., Patt-Shamir, B., Peleg, D.: Sparse reliable graph backbones. In: ICALP, pp. 261–272 (2010)
Colbourn, C., Debroni, B., Myrvold, W.: Estimating the coefficients of the reliability polynomial. In: Proceedings of the Seventeenth Manitoba Conference on Numerical Mathematics and Computing, Congressum Numerantium, vol. 62, pp. 217–223 (1988)
Colbourn, C., Myrvold, W., Neufeld, E.: Two algorithms for unranking arborescences. J. Algorithms 20, 268–281 (1996)
Fishman, G.: A Monte Carlo sampling plan for estimating network reliability. Oper. Res. 34, 581–594 (1986)
Galil, Z., Italiano, G.: Fully dynamic algorithms for 2-edge connectivity. SIAM J. Comput. 21, 1047–1069 (1992)
Gautschi, W., Inglese, G.: Lower bounds for the condition number of Vandermonde matrices. Numer. Math. 52, 241–250 (1988)
Henzinger, M., King, V.: Fully dynamic 2-edge connectivity algorithm in polylogarithmic time per operation. SRC Technical Note (1997)
Karger, D.: A randomized fully polynomial time approximation scheme for the all terminal network reliability problem. SIAM J. Comput. 29, 11–17 (1996)
Myrvold, W.: Counting k-component forests of a graph. Networks 22, 647–652 (1992)
La Poutre, J.: Maintenance of 2- and 3-connected components of graphs, Part II: 2- and 3-edge-connected components and 2-vertex-connected components. Tech. Rep. RUU-CS-90-27, Utrecht University (1990)
Provan, J., Ball, M.: The complexity of counting cuts and the probability that a graph is connected. SIAM J. Comput. 12, 777–788 (1983)
Ramanathan, A., Colbourn, C.: Counting almost minimum cutsets with reliability applications. Math. Program. 39, 253–261 (1987)
Tarjan, R.: Efficiency of a good but not linear set union algorithm. J. ACM 22, 215–225 (1975)
Watts, D., Strogatz, S.: Collective dynamics of ‘small-world’ networks. Nature 393, 440–442 (1998)
Wilson, D.: Generating random spanning trees more quickly than the cover time. In: Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing, pp. 296–303 (1996)
Acknowledgements
Thanks to Antonio Possolo, for helpful conversations about the measurements of the algorithms’ probability distributions, and for supporting Mr. Harris’s research at NIST. Thanks to Brian Cloteaux for helping to edit and revise this paper.
Appendices
Appendix A: Radius-Two Search for Bridges
In addition to detecting degree-one and degree-two vertices, Algorithm 2 conducts, in some circumstances, a breadth-first search around a modified vertex v to find any easy bridges. The key idea of the search is the assumption that, if the vertex v is connected to a radius-two vertex, then it is connected to the entire graph. We search for edges whose removal disconnects v from the radius-two vertices.
To begin, we let V_1 be the set of all vertices connected to v, and let V_2 = V − V_1 − {v}. A bridge can arise if there is a unique w ∈ V_1 that is connected to V_2. If furthermore this w is not connected to any vertex in V_1, and if w is connected by a single edge to v, then the edge 〈v,w〉 is a bridge. If, on the other hand, w has a unique edge e connecting it to V_2, then e is a bridge.
The following algorithm finds such bridges:

1. Initially set w = ∅.
2. For each x ∈ V_1:
3.   For each distinct vertex y connected to x:
4.     If y ∈ V_2 and w = ∅, set w = x and continue with the next x.
5.     If y ∈ V_2 and w ≠ ∅, we find no bridges. Halt.
6. If w = ∅, then the graph is disconnected. Return.
7. If there is a single edge connecting v to w, it is a bridge. Contract it and return.
8. If there is a single edge 〈w,u〉 with u ∈ V_2, then it is a bridge; contract it and return.
9. No bridges have been found. Terminate.
This code requires a data structure which supports efficient enumeration of distinct vertices; a mere list of edges (which may include multi-edges) is not enough. With such a data structure, this code runs in time \(O(V_{1}^{2})\). Because Algorithm 2 only conducts this search when the degree of v is bounded by a constant \(d_{\operatorname{max}}\), the total cost of these searches can be regarded as O(1). In practice, we did not implement this data structure; this abandons the guarantee on maximum running time, but in practice it is faster.
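The steps above can be sketched in code as follows. This is a minimal sketch of ours, not the paper's implementation: the adjacency-list representation, the function name, and the explicit guard that w touches no vertex of V_1 (taken from the prose description above) are our assumptions.

```python
def radius_two_bridge(adj, v):
    """Sketch of the radius-two bridge search around vertex v.
    adj maps each vertex to a list of neighbors (parallel edges repeated).
    Returns ('bridge', (a, b)), ('disconnected', None), or ('none', None)."""
    V1 = set(adj[v])
    w = None
    for x in V1:
        for y in set(adj[x]):              # enumerate distinct neighbors
            if y != v and y not in V1:     # y lies in V2 = V - V1 - {v}
                if w is None:
                    w = x                  # first exit into V2
                    break                  # continue with the next x
                return ('none', None)      # a second exit: no easy bridge
    if w is None:
        return ('disconnected', None)      # no path from v to V2
    # Single edge <v, w>, and w otherwise touches only V2: <v, w> is a bridge.
    if adj[v].count(w) == 1 and not (V1 & set(adj[w])):
        return ('bridge', (v, w))
    # Single edge from w into V2: that edge is a bridge.
    v2_neighbors = [y for y in adj[w] if y != v and y not in V1]
    if len(v2_neighbors) == 1:
        return ('bridge', (w, v2_neighbors[0]))
    return ('none', None)
```

For example, on the graph with edges 0–1, 0–2, 1–2, 1–3 and v = 0, vertex 1 is the unique member of V_1 touching V_2 = {3}, and the single edge 〈1,3〉 is reported as a bridge.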
There is one further improvement we can make here. If in either step (7) or (8) there is a double edge e_1, e_2 such that simultaneously removing e_1 and e_2 disconnects the graph, then we can simplify the graph by contracting the two edges e_1 and e_2 and inserting a loop of weight w(e_1)+w(e_2).
Appendix B: Estimating Variance
In Sect. 6, we examined worst-case variance across all possible graphs of a given size. In this section, we consider variance from a different angle. Suppose we are given a graph G and use the algorithms we have described to estimate its reliability generator. How reliable are these estimates? In this section, we describe a method for estimating relative variance.
To begin, let us consider top-down algorithms such as Algorithm 0 or 1. In such algorithms, we may remove an edge e with probability P(e); if the edge is removed, then the corresponding estimate is multiplied by 1/P(e). P(e) is essentially the importance function attached to each edge. For Algorithm 0, P(e)=1/m for all edges; for Algorithm 1, P(e)=1/|D(G)| if e is a non-bridge, and P(e)=0 otherwise.
In such cases, the straightforward way to estimate E[F(G)] and E[F(G)²] would be to take T samples of F(G) and extract the sample mean and variance. Let \(\hat{\mu}\) and \(\hat{\sigma}^{2}\) be the sample estimates obtained in this way. The standard error of \(\hat{\mu}\) scales as σ/√T, so these estimates may be very inaccurate when σ is large. For the high-order coefficients of the reverse reliability generator, this makes it exponentially expensive to estimate μ, σ accurately. This is a very dangerous situation: when the estimates F(G) are very inaccurate, it is also very difficult to determine how inaccurate they are.
A better method of estimating accuracy is to estimate E[F(G)] and E[F(G)²] using importance sampling. Instead of choosing the edge e to remove with the indicated probability P(e), we use an alternative importance-sampling probability distribution

$$P'(e) = \frac{\kappa(G - e)}{\sum_{f \in G} \kappa(G - f)}, $$

where κ(H) denotes the number of spanning trees of H.
We emphasize that there are two distinct probability distributions in play here. P(e) is the probability distribution used by algorithm F itself; P′(e) is the probability distribution we are imposing in order to measure algorithm F.
This gives us the following SIS algorithm F^{(d)} for estimating E[F(G)^d]:

Algorithm F^{(d)}

1. Set m_0 = 1.
2. For k = 1, …, K:
3.   Choose an edge e ∈ G with probability P′(e).
4.   Set \(m_{k} \leftarrow m_{k-1} \frac{1}{P(e)^{d-1} P'(e)}\). Set G ← G − e.
5. Return the estimate
$$\hat{r}^{(d)} = \sum_{k=0}^{K} \frac{m_k x^k}{(k!)^d} $$
It is easy to verify that E[F (d)(G)]=E[F(G)d].
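To make the zero-variance claim concrete, the sketch below (ours, not the paper's code; the function names and the dense matrix-tree subroutine are our assumptions) runs F^{(1)} with P′(e) proportional to κ(G − e). The product then telescopes: every run returns m_K/K! = κ(G), up to floating-point error.

```python
import itertools
import math
import random

import numpy as np

def kappa(n, edges):
    """Spanning-tree count via the Kirchhoff matrix-tree theorem."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1; L[j, j] += 1
        L[i, j] -= 1; L[j, i] -= 1
    return max(np.linalg.det(L[1:, 1:]), 0.0)   # 0 for disconnected graphs

def sis_top_coefficient(n, edges, rng):
    """One run of F^(1): remove K = m - n + 1 edges, each chosen with
    probability P'(e) proportional to kappa(G - e); return m_K / K!."""
    edges = list(edges)
    K = len(edges) - (n - 1)
    m_k = 1.0
    for _ in range(K):
        weights = [kappa(n, edges[:i] + edges[i + 1:])
                   for i in range(len(edges))]
        i = rng.choices(range(len(edges)), weights=weights)[0]
        m_k *= sum(weights) / weights[i]        # multiply by 1 / P'(e)
        edges.pop(i)
    return m_k / math.factorial(K)

k4 = list(itertools.combinations(range(4), 2))
print(sis_top_coefficient(4, k4, random.Random(0)))  # ~16 = kappa(K4), every run
```

Note that bridges automatically receive probability zero under P′, since κ(G − e) = 0 when e is a bridge, so the removals never disconnect the graph.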
The importance function P′(e) ensures that algorithm F (1) estimates the high-order coefficient of E[F(G)] (the number of spanning trees of G) exactly (variance zero), although the estimates for the lower-order coefficients have higher variance. The estimate of the high-order coefficients of E[F(G)]2 also has greatly improved variance, although the low-order coefficients are worse. We believe it is much safer to have an error estimate which is accurate when the error is large.
The estimate of σ 2 provided by F (2) is typically the right order of magnitude, but for low-order coefficients it is not very precise. The sample variance is a better estimator for these low-order coefficients. To use the sample variance safely, we first estimate σ 2 using F (2) for all coefficients. For those coefficients for which \(\hat{\sigma}^{2}\) is sufficiently small, we replace our estimate of σ 2 with the sample variance.
Note that to use this method, we must compute the importance function P′(e) for all edges e ∈ G at every stage of the recursion. To describe how this is done, we begin by recalling the Kirchhoff matrix-tree theorem, which counts the number of spanning trees of a graph G. Let A_G be the adjacency matrix of G, and let D be a diagonal matrix whose ith entry is the degree of vertex v_i. The Kirchhoff formula states that κ(G) = det L, where L = (D − A_G)_{11} denotes the minor of D − A_G obtained by removing the first row and column.
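As a concrete illustration of the Kirchhoff formula, the following sketch (ours, in Python with numpy; the function name is an assumption) computes κ(G) from the Laplacian minor:

```python
import numpy as np

def spanning_tree_count(adj):
    """Kirchhoff matrix-tree theorem: kappa(G) equals the determinant
    of the Laplacian L = D - A_G with its first row and column removed."""
    adj = np.asarray(adj, dtype=float)
    laplacian = np.diag(adj.sum(axis=1)) - adj
    minor = laplacian[1:, 1:]          # (D - A_G)_{11}
    return round(np.linalg.det(minor))

# Complete graph K4: Cayley's formula gives 4^(4-2) = 16 spanning trees.
print(spanning_tree_count(1 - np.eye(4)))  # -> 16
```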
The SIS algorithm F^{(d)} would appear to be very expensive if it computed κ naively from the Kirchhoff formula. There are m iterations, in each of which we must compute κ(G − e) for each edge. Evaluating a determinant costs O(n³) work and O(n²) memory. In all, the naive computation would cost O(m²n³) work and O(n²) memory.
Reference [4] describes a variety of algorithms to compute κ and to update this structure as edges of G are removed. We will discuss one variant of those algorithms. The main idea is to keep track of λ(e) = κ(G − e)/κ(G) for each edge e ∈ G. These are precomputed for the original graph G. Next, when we update G by removing an edge e, we must update λ to the new λ′. We will find the following notation convenient. If e = 〈i,j〉 is any edge in G, we define the column vector δ_e to be the vector which is +1 in coordinate i, is −1 in coordinate j, and is zero elsewhere. Observe that when edge e is removed from G, the matrix L changes by \(\delta_{e} \delta_{e}^{T}\):

$$L_{G - e} = L_{G} - \delta_e \delta_e^{T}. $$
Now let us examine how to update λ′. By the matrix determinant lemma we have λ(e) = 1 − δ_e^T L_G^{-1} δ_e, and the Sherman–Morrison formula applied to L_{G−e} = L_G − δ_e δ_e^T gives

$$\lambda'(d) = 1 - \delta_d^{T} L_{G-e}^{-1} \delta_d = \lambda(d) - \frac{(\delta_d^{T} L_G^{-1} \delta_e)^2}{\lambda(e)}. $$
As described in [4], this leads to the following technique for updating λ. Whenever we remove edge e from G, we use a sparse matrix inversion algorithm such as conjugate gradient to compute \(u = L^{-1}_{G} \delta_{e}\). We then update λ(d) for each edge d ∈ G by

$$\lambda'(d) = \lambda(d) - \frac{(\delta_d^{T} u)^2}{\lambda(e)}. $$
In total, this technique allows us to reduce the memory requirement to O(m) and the complexity to O(mn). Furthermore, in practice the conjugate gradient method needs far less than the worst-case number of iterations to compute an adequate approximation, so in practice the work requirement per iteration is more like ≈ m log n. This is still much more expensive than computing F itself, but it allows us to achieve accurate estimates for E[F(G)] and E[F(G)²] on graphs of interest. In many cases, straightforward sampling of F itself would have required infeasible computation to obtain accurate results.
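The update rule can be checked numerically. The sketch below (ours; it uses a dense inverse in place of conjugate gradient for brevity, and the names are our assumptions) computes λ on K4, removes one edge using a single solve u = L⁻¹δ_e, and compares the rank-one update against direct recomputation:

```python
import numpy as np

def laplacian_minor(n, edges):
    """Kirchhoff matrix L = (D - A_G)_{11} for the given edge list."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1; L[j, j] += 1
        L[i, j] -= 1; L[j, i] -= 1
    return L[1:, 1:]

def delta(n, e):
    """Signed incidence vector of edge e, with the first coordinate
    dropped to match the Laplacian minor."""
    d = np.zeros(n)
    d[e[0]], d[e[1]] = 1.0, -1.0
    return d[1:]

n = 4
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]   # K4
Linv = np.linalg.inv(laplacian_minor(n, edges))

# lambda(e) = kappa(G - e) / kappa(G) = 1 - delta_e^T L^{-1} delta_e
lam = {f: 1 - delta(n, f) @ Linv @ delta(n, f) for f in edges}

e = (0, 1)                       # edge to remove
u = Linv @ delta(n, e)           # one linear solve per removal
lam_new = {d: lam[d] - (delta(n, d) @ u) ** 2 / lam[e]
           for d in edges if d != e}

# Cross-check the rank-one update against recomputation on G - e.
L2inv = np.linalg.inv(laplacian_minor(n, [d for d in edges if d != e]))
for d in lam_new:
    assert abs(lam_new[d] - (1 - delta(n, d) @ L2inv @ delta(n, d))) < 1e-9
print("rank-one update matches direct recomputation")
```

On K4, for instance, λ(e) = 8/16 = 0.5 for every edge, since 8 of the 16 spanning trees avoid any given edge.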
Harris, D.G., Sullivan, F. & Beichl, I. Fast Sequential Importance Sampling to Estimate the Graph Reliability Polynomial. Algorithmica 68, 916–939 (2014). https://doi.org/10.1007/s00453-012-9703-x