Abstract
The reliability polynomial of a graph counts its connected subgraphs of various sizes. Algorithms based on sequential importance sampling (SIS) have been proposed to estimate a graph's reliability polynomial. We develop an improved SIS algorithm for estimating the reliability polynomial. The new algorithm runs in expected time O(m log n · α(m, n)) at worst and ≈ m in practice, compared to Θ(m²) for the previous algorithm. We analyze the error bounds of this algorithm, including comparison to alternative estimation algorithms. In addition to the theoretical analysis, we discuss methods for estimating the variance and describe experimental results on a variety of random graphs.
References
Beichl, I., Cloteaux, B., Sullivan, F.: An approximation algorithm for the coefficients of the reliability polynomial. Congr. Numer. 197, 143–151 (2010)
Chechik, S., Emek, Y., Patt-Shamir, B., Peleg, D.: Sparse reliable graph backbones. In: ICALP, pp. 261–272 (2010)
Colbourn, C., Debroni, B., Myrvold, W.: Estimating the coefficients of the reliability polynomial. In: Proceedings of the Seventeenth Manitoba Conference on Numerical Mathematics and Computing, Congressum Numerantium, vol. 62, pp. 217–223 (1988)
Colbourn, C., Myrvold, W., Neufeld, E.: Two algorithms for unranking arborescences. J. Algorithms 20, 268–281 (1996)
Fishman, G.: A Monte Carlo sampling plan for estimating network reliability. Oper. Res. 34, 581–594 (1986)
Galil, Z., Italiano, G.: Fully dynamic algorithms for 2-edge connectivity. SIAM J. Comput. 21, 1047–1069 (1992)
Gautschi, W., Inglese, G.: Lower bounds for the condition number of Vandermonde matrices. Numer. Math. 52, 241–250 (1988)
Henzinger, M., King, V.: Fully dynamic 2-edge connectivity algorithm in polylogarithmic time per operation. SRC Technical Note (1997)
Karger, D.: A randomized fully polynomial time approximation scheme for the all terminal network reliability problem. SIAM J. Comput. 29, 11–17 (1996)
Myrvold, W.: Counting k-component forests of a graph. Networks 22, 647–652 (1992)
La Poutre, J.: Maintenance of 2- and 3-connected components of graphs, Part II: 2- and 3-edge-connected components and 2-vertex-connected components. Tech. Rep. RUU-CS-90-27, Utrecht University (1990)
Provan, J., Ball, M.: The complexity of counting cuts and the probability that a graph is connected. SIAM J. Comput. 12, 777–788 (1983)
Ramanathan, A., Colbourn, C.: Counting almost minimum cutsets with reliability applications. Math. Program. 39, 253–261 (1987)
Tarjan, R.: Efficiency of a good but not linear set union algorithm. J. ACM 22, 215–225 (1975)
Watts, D., Strogatz, S.: Collective dynamics of ‘small-world’ networks. Nature 393, 440–442 (1998)
Wilson, D.: Generating random spanning trees more quickly than the cover time. In: Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing, pp. 296–303 (1996)
Acknowledgements
Thanks to Antonio Possolo, for helpful conversations about the measurements of the algorithms’ probability distributions, and for supporting Mr. Harris’s research at NIST. Thanks to Brian Cloteaux for helping to edit and revise this paper.
Appendices
Appendix A: Radius-Two Search for Bridges
In addition to detecting degree-one and degree-two vertices, Algorithm 2 conducts, in some circumstances, a breadth-first search around a modified vertex v to find any easy bridges. The key idea of the search is the assumption that, if the vertex v is connected to a radius-two vertex, then it is connected to the entire graph. We search for edges whose removal disconnects v from the radius-two vertices.
To begin, we let V_1 be the set of all vertices connected to v, and let V_2 = V − V_1 − {v}. A bridge can arise if there is a unique w ∈ V_1 that is connected to V_2. If furthermore this w is not connected to any vertex in V_1, and if w is connected by a single edge to v, then the edge 〈v,w〉 is a bridge. If, on the other hand, w has a unique edge e connecting it to V_2, then e is a bridge.
The following algorithm finds such bridges:

1. Initially set w = ∅.
2. For each x ∈ V_1:
3.   For each distinct vertex y connected to x:
4.     If y ∈ V_2 and w = ∅, set w = x and continue with the next x.
5.     If y ∈ V_2 and w ≠ ∅, we find no bridges. Halt.
6. If w = ∅, then the graph is disconnected. Return.
7. If there is a single edge connecting v to w, it is a bridge. Contract it and return.
8. If there is a single edge 〈w,u〉 with u ∈ V_2, then it is a bridge; contract it and return.
9. No bridges have been found. Terminate.
This code requires a data structure which supports efficient enumeration of distinct vertices; a mere list of edges (which may include multi-edges) is not enough. With such a data structure, this code runs in time \(O(V_{1}^{2})\). Because Algorithm 2 only conducts this search when the degree of v is bounded by a constant \(d_{\operatorname{max}}\), the total cost of these searches can be regarded as O(1). In practice, we did not implement this data structure; this abandons the guarantee on maximum running time, but in practice it is faster.
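The steps above can be sketched in code as follows. This is a minimal sketch of ours, not the paper's implementation: the adjacency-list representation, the function name, and the explicit guard that w touches no vertex of V_1 (taken from the prose description above) are our assumptions.

```python
def radius_two_bridge(adj, v):
    """Sketch of the radius-two bridge search around vertex v.
    adj maps each vertex to a list of neighbors (parallel edges repeated).
    Returns ('bridge', (a, b)), ('disconnected', None), or ('none', None)."""
    V1 = set(adj[v])
    w = None
    for x in V1:
        for y in set(adj[x]):              # enumerate distinct neighbors
            if y != v and y not in V1:     # y lies in V2 = V - V1 - {v}
                if w is None:
                    w = x                  # first exit into V2
                    break                  # continue with the next x
                return ('none', None)      # a second exit: no easy bridge
    if w is None:
        return ('disconnected', None)      # no path from v to V2
    # Single edge <v, w>, and w otherwise touches only V2: <v, w> is a bridge.
    if adj[v].count(w) == 1 and not (V1 & set(adj[w])):
        return ('bridge', (v, w))
    # Single edge from w into V2: that edge is a bridge.
    v2_neighbors = [y for y in adj[w] if y != v and y not in V1]
    if len(v2_neighbors) == 1:
        return ('bridge', (w, v2_neighbors[0]))
    return ('none', None)
```

For example, on the graph with edges 0–1, 0–2, 1–2, 1–3 and v = 0, vertex 1 is the unique member of V_1 touching V_2 = {3}, and the single edge 〈1,3〉 is reported as a bridge.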
There is one further improvement we can make here. If in either step (7) or (8) there is a double edge e_1, e_2 such that simultaneously removing e_1 and e_2 disconnects the graph, then we can simplify the graph by contracting the two edges e_1 and e_2 and inserting a loop of weight w(e_1)+w(e_2).
Appendix B: Estimating Variance
In Sect. 6, we examined worst-case variance across all possible graphs of a given size. In this section, we consider variance from a different angle. Suppose we are given a graph G and use the algorithms we have described to estimate its reliability generator. How reliable are these estimates? In this section, we describe a method for estimating relative variance.
To begin, let us consider top-down algorithms such as Algorithm 0 or 1. In such algorithms, we may remove an edge e with probability P(e); if the edge is removed, then the corresponding estimate is multiplied by 1/P(e). P(e) is essentially the importance function attached to each edge. For Algorithm 0, P(e)=1/m for all edges; for Algorithm 1, P(e)=1/|D(G)| if e is a non-bridge, and P(e)=0 otherwise.
In such cases, the straightforward way to estimate E[F(G)] and E[F(G)²] would be to take T samples of F(G) and extract the sample mean and variance. Let \(\hat{\mu}\) and \(\hat{\sigma}^{2}\) be the sample estimates obtained in this way. The standard error of \(\hat{\mu}\) scales as σ/√T, so these estimates may be very inaccurate when σ is large. For the high-order coefficients of the reverse reliability generator, this makes it exponentially expensive to estimate μ, σ accurately. This is a very dangerous situation: when the estimates F(G) are very inaccurate, it is also very difficult to determine how inaccurate they are.
A better method of estimating accuracy is to estimate E[F(G)] and E[F(G)²] using importance sampling. Instead of choosing the edge e to remove with the indicated probability P(e), we use an alternative importance-sampling probability distribution

$$P'(e) = \frac{\kappa(G - e)}{\sum_{f \in G} \kappa(G - f)}, $$

where κ(H) denotes the number of spanning trees of H.
We emphasize that there are two distinct probability distributions in play here. P(e) is the probability distribution used by algorithm F itself; P′(e) is the probability distribution we are imposing in order to measure algorithm F.
This gives us the following SIS algorithm F^{(d)} for estimating E[F(G)^d]:

Algorithm F^{(d)}

1. Set m_0 = 1.
2. For k = 1, …, K:
3.   Choose an edge e ∈ G with probability P′(e).
4.   Set \(m_{k} \leftarrow m_{k-1} \frac{1}{P(e)^{d-1} P'(e)}\). Set G ← G − e.
5. Return the estimate
$$\hat{r}^{(d)} = \sum_{k=0}^{K} \frac{m_k x^k}{(k!)^d} $$
It is easy to verify that E[F (d)(G)]=E[F(G)d].
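To make the zero-variance claim concrete, the sketch below (ours, not the paper's code; the function names and the dense matrix-tree subroutine are our assumptions) runs F^{(1)} with P′(e) proportional to κ(G − e). The product then telescopes: every run returns m_K/K! = κ(G), up to floating-point error.

```python
import itertools
import math
import random

import numpy as np

def kappa(n, edges):
    """Spanning-tree count via the Kirchhoff matrix-tree theorem."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1; L[j, j] += 1
        L[i, j] -= 1; L[j, i] -= 1
    return max(np.linalg.det(L[1:, 1:]), 0.0)   # 0 for disconnected graphs

def sis_top_coefficient(n, edges, rng):
    """One run of F^(1): remove K = m - n + 1 edges, each chosen with
    probability P'(e) proportional to kappa(G - e); return m_K / K!."""
    edges = list(edges)
    K = len(edges) - (n - 1)
    m_k = 1.0
    for _ in range(K):
        weights = [kappa(n, edges[:i] + edges[i + 1:])
                   for i in range(len(edges))]
        i = rng.choices(range(len(edges)), weights=weights)[0]
        m_k *= sum(weights) / weights[i]        # multiply by 1 / P'(e)
        edges.pop(i)
    return m_k / math.factorial(K)

k4 = list(itertools.combinations(range(4), 2))
print(sis_top_coefficient(4, k4, random.Random(0)))  # ~16 = kappa(K4), every run
```

Note that bridges automatically receive probability zero under P′, since κ(G − e) = 0 when e is a bridge, so the removals never disconnect the graph.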
The importance function P′(e) ensures that algorithm F (1) estimates the high-order coefficient of E[F(G)] (the number of spanning trees of G) exactly (variance zero), although the estimates for the lower-order coefficients have higher variance. The estimate of the high-order coefficients of E[F(G)]2 also has greatly improved variance, although the low-order coefficients are worse. We believe it is much safer to have an error estimate which is accurate when the error is large.
The estimate of σ 2 provided by F (2) is typically the right order of magnitude, but for low-order coefficients it is not very precise. The sample variance is a better estimator for these low-order coefficients. To use the sample variance safely, we first estimate σ 2 using F (2) for all coefficients. For those coefficients for which \(\hat{\sigma}^{2}\) is sufficiently small, we replace our estimate of σ 2 with the sample variance.
Note that to use this method, we must compute the importance function P′(e) for all edges e ∈ G at every stage of the recursion. To describe how this is done, we begin by recalling the Kirchhoff matrix-tree theorem, which counts the number of spanning trees of a graph G. Let A_G be the adjacency matrix of G, and let D be a diagonal matrix whose ith entry is the degree of vertex v_i. The Kirchhoff formula states that κ(G) = det L, where L = (D − A_G)_{11} denotes the minor of D − A_G obtained by removing the first row and column.
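As a concrete illustration of the Kirchhoff formula, the following sketch (ours, in Python with numpy; the function name is an assumption) computes κ(G) from the Laplacian minor:

```python
import numpy as np

def spanning_tree_count(adj):
    """Kirchhoff matrix-tree theorem: kappa(G) equals the determinant
    of the Laplacian L = D - A_G with its first row and column removed."""
    adj = np.asarray(adj, dtype=float)
    laplacian = np.diag(adj.sum(axis=1)) - adj
    minor = laplacian[1:, 1:]          # (D - A_G)_{11}
    return round(np.linalg.det(minor))

# Complete graph K4: Cayley's formula gives 4^(4-2) = 16 spanning trees.
print(spanning_tree_count(1 - np.eye(4)))  # -> 16
```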
The SIS algorithm F^{(d)} would appear to be very expensive if it computed κ naively from the Kirchhoff formula. There are m iterations, in each of which we must compute κ(G − e) for each edge. Evaluating a determinant costs O(n³) work and O(n²) memory. In all, the naive computation would cost O(m²n³) work and O(n²) memory.
Reference [4] describes a variety of algorithms to compute κ and to update this structure as edges of G are removed. We will discuss one variant of those algorithms. The main idea is to keep track of λ(e) = κ(G − e)/κ(G) for each edge e ∈ G. These are precomputed for the original graph G. Next, when we update G by removing an edge e, we must update λ to the new λ′. We will find the following notation convenient. If e = 〈i,j〉 is any edge in G, we define the column vector δ_e to be the vector which is +1 in coordinate i, is −1 in coordinate j, and is zero elsewhere. Observe that when edge e is removed from G, the matrix L changes by \(\delta_{e} \delta_{e}^{T}\):

$$L_{G - e} = L_{G} - \delta_e \delta_e^{T}. $$
Now let us examine how to update λ′. By the matrix determinant lemma we have λ(e) = 1 − δ_e^T L_G^{-1} δ_e, and the Sherman–Morrison formula applied to L_{G−e} = L_G − δ_e δ_e^T gives

$$\lambda'(d) = 1 - \delta_d^{T} L_{G-e}^{-1} \delta_d = \lambda(d) - \frac{(\delta_d^{T} L_G^{-1} \delta_e)^2}{\lambda(e)}. $$
As described in [4], this leads to the following technique for updating λ. Whenever we remove edge e from G, we use a sparse matrix inversion algorithm such as conjugate gradient to compute \(u = L^{-1}_{G} \delta_{e}\). We then update λ(d) for each edge d ∈ G by

$$\lambda'(d) = \lambda(d) - \frac{(\delta_d^{T} u)^2}{\lambda(e)}. $$
In total, this technique allows us to reduce the memory requirement to O(m) and the complexity to O(mn). Furthermore, in practice the conjugate gradient method needs far less than the worst-case number of iterations to compute an adequate approximation, so in practice the work requirement per iteration is more like ≈ m log n. This is still much more expensive than computing F itself, but it allows us to achieve accurate estimates for E[F(G)] and E[F(G)²] on graphs of interest. In many cases, straightforward sampling of F itself would have required infeasible computation to obtain accurate results.
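The update rule can be checked numerically. The sketch below (ours; it uses a dense inverse in place of conjugate gradient for brevity, and the names are our assumptions) computes λ on K4, removes one edge using a single solve u = L⁻¹δ_e, and compares the rank-one update against direct recomputation:

```python
import numpy as np

def laplacian_minor(n, edges):
    """Kirchhoff matrix L = (D - A_G)_{11} for the given edge list."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1; L[j, j] += 1
        L[i, j] -= 1; L[j, i] -= 1
    return L[1:, 1:]

def delta(n, e):
    """Signed incidence vector of edge e, with the first coordinate
    dropped to match the Laplacian minor."""
    d = np.zeros(n)
    d[e[0]], d[e[1]] = 1.0, -1.0
    return d[1:]

n = 4
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]   # K4
Linv = np.linalg.inv(laplacian_minor(n, edges))

# lambda(e) = kappa(G - e) / kappa(G) = 1 - delta_e^T L^{-1} delta_e
lam = {f: 1 - delta(n, f) @ Linv @ delta(n, f) for f in edges}

e = (0, 1)                       # edge to remove
u = Linv @ delta(n, e)           # one linear solve per removal
lam_new = {d: lam[d] - (delta(n, d) @ u) ** 2 / lam[e]
           for d in edges if d != e}

# Cross-check the rank-one update against recomputation on G - e.
L2inv = np.linalg.inv(laplacian_minor(n, [d for d in edges if d != e]))
for d in lam_new:
    assert abs(lam_new[d] - (1 - delta(n, d) @ L2inv @ delta(n, d))) < 1e-9
print("rank-one update matches direct recomputation")
```

On K4, for instance, λ(e) = 8/16 = 0.5 for every edge, since 8 of the 16 spanning trees avoid any given edge.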
Harris, D.G., Sullivan, F. & Beichl, I. Fast Sequential Importance Sampling to Estimate the Graph Reliability Polynomial. Algorithmica 68, 916–939 (2014). https://doi.org/10.1007/s00453-012-9703-x