Finding minimum node separators: A Markov chain Monte Carlo method
Introduction
Today’s critical infrastructures are organized as networks, such as communication networks, smart electrical power grids, and water distribution systems. In these networks, a small number of node failures can have a significant impact on connectivity and lead to graph separation [2], [3], where components become more reliable as their sizes increase. For example, in a power grid, there should be a sufficient amount of power generation to serve the power loads, while not every bus (buses are frequently represented as nodes in the graph modeling the power grid) can have a generator. Protecting minimum separating nodes can keep the sizes of components above a given threshold, thereby increasing the chance of load being served. Thus, identifying the weak parts of practical networks is a major concern in general. Graph separation can also trigger further cascading failures, because the amount of resource in the remaining components (e.g., the amount of generated power) is so reduced that the remaining components become overloaded and subsequently fail [4]. Thus, one of the most important goals of network attackers is to separate nodes such that the sizes of connected components become small. However, attackers have resource constraints as well, and they would like to inflict the greatest harm at limited cost. A defender wants to identify these weakest points in advance. In this work, we consider the problem of finding a minimum α-separator that partitions the graph into connected components of sizes smaller than αn, where n is the number of nodes in the graph. Finding a minimum α-separator is proven to be NP-hard for general topologies [5]. For topologies such as trees and cycles, the authors in [6] have developed polynomial-time algorithms, yet they require knowledge of the type of the graph topology in advance.
To tackle the minimum α-separator problem without prior knowledge of the graph topology, we apply a Markov chain Monte Carlo (MCMC) method. The basic idea is to solve this combinatorial optimization problem by constructing a random walk over a Markov chain that traverses feasible solutions, where the transitions steer the system toward states with better objective values (smaller α-separators in our case). In other words, the stationary distribution assigns higher probability to states with better objective values (i.e., closer to the minimum). The Markov chain Monte Carlo method has been applied to many NP-hard problems in various applications [7], [8], [9], [10], [11]. In our random walk algorithm, we additionally design a simple data structure to quickly identify the sizes of the components, which vary in each step of the random walk. This allows us to further reduce the computational complexity of each step.
We then analytically characterize how long it takes our algorithm to obtain the optimal solution. The standard metric often used in the literature is the mixing time, i.e., the time until the Markov chain is close to its stationary distribution, and there are several existing works showing the polynomial convergence of the chain. For the independent set problem, [12], [13], [14] used a coupling technique to show polynomial convergence. However, the conditions that guarantee polynomial convergence are limited. For example, in [12], the maximum node degree of a graph should be less than or equal to five for polynomial convergence. Hence, in this paper we present an approximate analysis of the first passage time, using a hierarchical structure of the underlying Markov chain. We then provide sufficient conditions for the expected first passage time to be polynomial. As an example, we compute the expected first passage time in star-like topologies, which is O(n^2), where n is the number of nodes. The conditions also cover simple topologies in which the sizes of minimum α-separators do not scale with the network size, e.g., line, circle, and balanced tree.
To evaluate the performance and support our analytical results, we run our random walk algorithm over various topologies, including real topologies from US fiber networks of 20 Internet providers [15] and Italian power grid and Internet networks [16], [17]. In all topologies, our random walk algorithm converges to a solution within O(n^3) steps for a network with n nodes. We also compare our random walk algorithm with other heuristic algorithms, including highest-degree-first and greedy algorithms, and illustrate the performance improvements we obtain. We underscore that the solution from our algorithm allows us to characterize the weakest points in the network that need to be strengthened. In real topologies, we find that attacking a dense area may not be an efficient way to partition a network graph, a partition that can lead to large cascading effects. Finally, we discuss some future directions for applying our general results to defense problems and to practical networks such as power grids and water distribution systems, taking physical dynamics into consideration. Even though our graphical model is too simple to capture the physical dynamics of practical systems, our work can identify initial weak points in these complex systems. As future work, we will develop an algorithm to find weak points in networks with complex physical dynamics (e.g., correlated failures).
Related work
We classify previous work into several categories and summarize them as follows.
Network vulnerability and reliability: In the network science literature, a famous paper by Paolo et al. [18] analyzed the size of the largest component after node removals and studied the ability to resist failures depending on the type of attacks (i.e., random or targeted node attacks) and the type of networks (e.g., the Erdős–Rényi random graph and the Barabási–Albert scale-free power-law network). They showed that
System model
We consider a simple graph G = (V, E) with n = |V| nodes, and refer to the nodes adjacent to a node v ∈ V as its neighbors. We assume that the attack cost is homogeneous across nodes. Our results can be readily extended to the case of heterogeneous attack costs, which will be discussed later. An α-separator W for 1/n ≤ α < 1 is defined as a subset of nodes such that the sizes of all components in the graph G∖W are smaller than or equal to αn.
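To make the definition concrete, the following minimal Python sketch checks whether a node set W is an α-separator by measuring the connected components of G∖W with breadth-first search. The adjacency-dictionary representation and the names `component_sizes` and `is_alpha_separator` are illustrative assumptions and do not come from the paper.

```python
from collections import deque


def component_sizes(adj, removed):
    """Return the sizes of the connected components left after deleting `removed`.

    `adj` maps each node to an iterable of its neighbors (assumed undirected).
    """
    seen = set(removed)  # removed nodes are never visited
    sizes = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        sizes.append(size)
    return sizes


def is_alpha_separator(adj, W, alpha):
    """True if every component of the graph minus W has at most alpha * n nodes."""
    n = len(adj)
    return all(size <= alpha * n for size in component_sizes(adj, W))


# Hypothetical example: a star with center 0 and leaves 1..4.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
center_only = is_alpha_separator(star, {0}, 0.5)  # leaves become singletons
one_leaf = is_alpha_separator(star, {1}, 0.5)     # a component of 4 nodes remains
```

In this example, removing the center leaves only singleton components, so {0} qualifies for α = 0.5, whereas removing a single leaf leaves a component of four nodes, which exceeds αn = 2.5.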
Random walk algorithm
We develop a random walk algorithm over a Metropolis chain for the α-separator problem. Our random walk algorithm takes one of three actions in each step: (1) remove a vertex v from the attack set W, (2) add a vertex v to W, or (3) stay at the current state. To compute the sizes of connected components efficiently after each action, we apply a simple data structure and an update mechanism, which will be explained shortly. The formal algorithm description is as follows:
Our random walk algorithm
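The formal listing is not reproduced in this excerpt. Purely as an illustration of the three actions described above, a generic Metropolis-style walk over feasible α-separators might be sketched as follows. This is not the authors' algorithm: the uniform node proposal, the acceptance probability `add_prob` for additions (which corresponds to a target distribution proportional to `add_prob` raised to |W|), the all-nodes starting state, and the function names are all assumptions. The sketch also recomputes components from scratch with breadth-first search instead of using the paper's incremental data structure.

```python
import random
from collections import deque


def largest_component(adj, removed):
    """Size of the largest connected component after deleting `removed`."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best


def metropolis_separator(adj, alpha, steps, add_prob=0.1, seed=0):
    """Random walk over feasible alpha-separators; returns the smallest one seen."""
    rng = random.Random(seed)
    nodes = list(adj)
    limit = alpha * len(nodes)
    W = set(nodes)  # assumed start: removing every node is trivially feasible
    best = set(W)
    for _ in range(steps):
        v = rng.choice(nodes)  # assumed proposal: toggle a uniformly chosen node
        if v in W:
            # Action (1): remove v from W, allowed only if W stays a separator.
            if largest_component(adj, W - {v}) <= limit:
                W.remove(v)
        elif rng.random() < add_prob:
            # Action (2): add v to W; a superset of a separator is a separator.
            W.add(v)
        # Otherwise action (3): stay at the current state.
        if len(W) < len(best):
            best = set(W)
    return best


# Hypothetical example: a star with center 0 and leaves 1..4.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
found = metropolis_separator(star, alpha=0.5, steps=5000)
```

On the five-node star with α = 0.5, the only single-node α-separator is the center, so a sufficiently long run is expected to return it. Occasional additions let the walk escape locally minimal sets such as three leaves, from which no single removal is feasible.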
First passage time analysis
In this section, we analytically characterize how long it takes to obtain the optimal solution, i.e., the first passage time to an optimal state, based on a hierarchical Markov chain under uniform node weights and a fixed cooling parameter ρ. Several existing works use coupling techniques to show the polynomial convergence of Markov chain Monte Carlo methods [12], [13], [14]. In most problems that use coupling techniques, the transitions of a node only depend on neighboring nodes, such
Simulation results
In this section, we run our random walk algorithm and other heuristic algorithms over diverse topologies, including real topologies from US fiber networks [15] and the Italian power grid [16], [17]. We compute the average first passage time to a minimum α-separator and validate our analytical results. We also compare our random walk algorithm with other heuristic algorithms, namely highest-degree-first and greedy algorithms, from the perspective of the largest component for a given budget (i.e., the same
Conclusion
In this paper, we developed a random walk algorithm based on a Metropolis chain to solve the minimum α-separator problem. We analyzed the first passage time of our random walk algorithm and investigated the conditions for polynomial first passage time, under the homogeneous assumption for states in the same hierarchy. The conditions include simple topologies where the sizes of minimum α-separators do not scale with the network size, e.g., star, line, circle, and balanced tree. Specifically, we
References (50)
- et al. On Markov chains for independent sets. J Algorithms (2000)
- et al. Quantifying the resilience of community structures in networks. Reliab Eng Syst Saf (2018)
- et al. Connectivity reliability and topological controllability of infrastructure networks: a comparative assessment. Reliab Eng Syst Saf (2016)
- et al. Hazard tolerance of spatially distributed complex networks. Reliab Eng Syst Saf (2017)
- et al. Evaluation of the robustness of critical infrastructures by hierarchical graph representation, clustering and Monte Carlo simulation. Reliab Eng Syst Saf (2016)
- et al. Finding good approximate vertex and edge partitions is NP-hard. Inf Process Lett (1992)
- et al. General network reliability problem and its efficient solution by subset simulation. Probab Eng Mech (2015)
- et al. Residual life estimation based on a generalized Wiener degradation process. Reliab Eng Syst Saf (2014)
- et al. Development of a Bayesian multi-state degradation model for up-to-date reliability estimations of working industrial components. Reliab Eng Syst Saf (2017)
- et al. Finding minimum node separators: a Markov chain Monte Carlo method. International conference on the design of reliable communication networks (DRCN) (2017)
- Causes of the 2003 major grid blackouts in North America and Europe, and recommended means to improve system dynamic performance. IEEE Trans Power Syst
- Catastrophic cascade of failures in interdependent networks. Nature
- Model for cascading failures in complex networks. APS Phys Rev E
- Finding small balanced separators. Proceedings of the thirty-eighth annual ACM symposium on Theory of computing
- K-separator problem, Ph.D. thesis
- Approximating the permanent. SIAM J Comput
- The Markov chain Monte Carlo method: an approach to approximate counting and integration. Approximation Algorithms NP-hard Prob
- Optimal CSMA: a survey. IEEE international conference on communication systems (ICCS)
- Peer counting and sampling in overlay networks: random walk methods. Proceedings of the twenty-fifth annual ACM symposium on principles of distributed computing
- Counting independent sets up to the tree threshold. Proceedings of the thirty-eighth annual ACM symposium on theory of computing
- Approximately counting up to four. Proceedings of the twenty-ninth annual ACM symposium on theory of computing
- Intertubes: a study of the US long-haul fiber-optic infrastructure. ACM SIGCOMM computer communication review
- Modelling interdependent infrastructures using interacting dynamical models. Int J Crit Infrastruct
- Robustness of interdependent networks: the case of communication networks and the power grid. IEEE GLOBECOM
- 1
A preliminary version of this work was presented at DRCN (International Conference On Design Of Reliable Communication Networks) 2017 [1].
- 2
This work was funded in part by a grant from the Defense Threat Reduction Agency (DTRA) HDTRA1-14-1-0058; grants from the Army Research Office (ARO) MURI W911NF-12-1-0385; and NSF CNS-1446582. The work of Joohyun Lee was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Education) (No. 2018R1D1A1B07045181). The work of Hyang-Won Lee was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2018R1D1A1B07048388).