A general Evolutionary Framework for different classes of Critical Node Problems

https://doi.org/10.1016/j.engappai.2016.06.010

Abstract

We design a flexible Evolutionary Framework for solving several classes of the Critical Node Problem (CNP), i.e. the maximal fragmentation of a graph through node deletion, given a measure of connectivity. The algorithm uses greedy rules to guide the search towards good-quality solutions during the reproduction and mutation phases. Such rules, which are only partially reported in the literature, are generalised and adapted to the six different formulations of the CNP considered throughout the paper. The link between solutions of different CNP formulations is investigated, both quantitatively and qualitatively. Furthermore, we provide a comparison with the best known results, when those are available in the literature, which confirms the good overall quality of our solutions.

Introduction

The Critical Node Problem (CNP) is a class of Interdiction Network Problems (Wollmer, 1964; Wood, 1993) that focuses on maximally fragmenting a graph $G(V,E)$ by deleting a set $S \subseteq V$ of its nodes (and all edges incident on such nodes). This problem is of interest in a wide range of possible situations, including the identification of key players in a social network (Borgatti, 2006), transportation networks' vulnerability (Jenelius et al., 2006), power grid construction and vulnerability (Salmerón et al., 2004), homeland security (Brown et al., 2006), telecommunications (Alevras et al., 1997) or epidemic control (Zhou et al., 2006) and immunisation strategies (Arulselvan et al., 2009; Cohen et al., 2003; Ventresca, 2012). A possible application to computational biology, through the example of protein–protein interaction networks, has been suggested in Boginski et al. (2009).

Each domain of application usually defines a specific version of the problem through the use of a particular connectivity measure. Moreover, solving real graphs with up to thousands of nodes often calls for the use of an efficient heuristic algorithm. The contribution of the approach advocated here is twofold: on one hand, it provides a global and flexible framework that allows us to deal with different fragmentation measures. On the other hand, it can find good quality solutions with limited costs in terms of algorithmic implementation and computational effort. To the best of the authors' knowledge, this is the first attempt to develop a general tool for tackling different classes of the CNP.

We will represent a solution by the set of its deleted nodes $S$. The degree of fragmentation of the induced graph $G[V \setminus S]$ needs to be measured by a given connectivity metric. We will consider only undirected graphs and we denote the set of maximal connected components as $H$ and the cardinality of the said components as $|h|$ for $h \in H$.

Many connectivity measures can be devised according to the type of application desired. We will concentrate on the measures that take into account the number of remaining connected components and their cardinality after the deletion of set $S$, which is generally enough to determine which nodes are still able to interact in the remaining network. These measures are defined as (i) pair-wise connectivity, i.e. the number of pairs of nodes connected by a path inside the graph, (ii) the size of the largest connected component and (iii) the number of connected components. The value of these three measures for a solution set $S$ will be expressed, respectively, through the following mathematical functions:

$$f(S) = \left|\{\, i,j \in V \setminus S : i \text{ and } j \text{ connected by a path in } G[V \setminus S] \,\}\right|,$$
$$C(S) = \max\{\, |h|,\ h \in H(G[V \setminus S]) \,\},$$
$$H(S) = |H(G[V \setminus S])|.$$
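The three measures can be evaluated with a single traversal of the residual graph. The following is a minimal sketch, assuming the graph is stored as an adjacency dictionary; the function names `component_sizes` and `connectivity_measures` are illustrative and do not come from the paper. Pairs are counted component by component, since two nodes are connected by a path exactly when they lie in the same component.

```python
from collections import deque

def component_sizes(adj, deleted):
    """Sizes of the connected components of the graph once `deleted` is removed."""
    seen = set(deleted)          # deleted nodes are never visited
    sizes = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:             # breadth-first search of one component
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        sizes.append(size)
    return sizes

def connectivity_measures(adj, deleted):
    """Return (f(S), C(S), H(S)) for the deletion set S = `deleted`."""
    sizes = component_sizes(adj, deleted)
    f = sum(s * (s - 1) // 2 for s in sizes)  # connected pairs, per component
    c = max(sizes, default=0)                 # largest component
    h = len(sizes)                            # number of components
    return f, c, h

# Illustrative path graph 0-1-2-3-4 (not taken from the paper).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(connectivity_measures(path, {2}))
```

Deleting the middle node of this path leaves two components of two nodes each, so the call reports one connected pair per component.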

Pair-wise connectivity $f(S)$ can alternatively be expressed in terms of the cardinality of the maximal connected components: $f(S) = \sum_{h \in H} \frac{|h|(|h|-1)}{2}$. Even though these measures are all different and can lead to very different optimal solutions, as explicitly demonstrated in Shen and Smith (2012), they are not generally unrelated. For example, the ideal situation for minimising the pair-wise connectivity is to obtain the largest number of connected components $H(S)$ with the smallest possible variance in their cardinality. This implies a minimisation of the size of the largest component. In practice, this means that disrupting pair-wise connectivity $f(S)$ is a tradeoff between minimising the cardinality of the largest component $C(S)$ and maximising the number of connected components $H(S)$. Since the nodes in $S$ are removed or disabled, we do not count them as single-node components. An example of the fragmentation of a small graph is provided in Fig. 1: after the removal of two nodes (numbers 1 and 2), the graph is split into two connected components of five nodes each. This solution is optimal whether we minimise $f(S)$, minimise $C(S)$ or maximise $H(S)$ by removing at most two nodes from the graph, with corresponding values $f(\{1,2\})=20$, $C(\{1,2\})=5$ and $H(\{1,2\})=2$.
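As a worked check of the component-size formula, the value reported for the example can be recomputed from the two components of five nodes. The second line is a hypothetical comparison, not taken from the paper: it splits the same ten remaining nodes into components of sizes 9 and 1, to illustrate why balanced components give a lower pair-wise connectivity.

```latex
\begin{align*}
f(\{1,2\}) &= \frac{5 \cdot 4}{2} + \frac{5 \cdot 4}{2} = 10 + 10 = 20, \\
f_{\text{unbalanced}} &= \frac{9 \cdot 8}{2} + \frac{1 \cdot 0}{2} = 36 + 0 = 36 .
\end{align*}
```

Both splits have two components, yet the unbalanced one leaves almost twice as many connected pairs.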

Given a connectivity measure, a CNP solution is defined by the set of deleted nodes and the value of the connectivity metric for the resulting graph. Depending on the problem at hand, the selection of the nodes can be performed using two complementary approaches:

  • the budget constrained formulation: minimise/maximise the connectivity under a budget limitation over $S$ ($|S| \le K$);

  • the connectivity constrained formulation: minimise the number of nodes deleted ($|S|$) in order to meet a threshold connectivity value.

For the sake of clarity, we will refer to the problems with the different connectivity measures $f(S)$, $C(S)$ and $H(S)$ as CNP1, CNP2 and CNP3, respectively. For each problem, we consider the two variants of the CNP that arise from the budget constrained (“a”) and connectivity constrained (“b”) formulations, that is:

  • CNP1a: minimise $f(S)$ (pair-wise connectivity) subject to $|S| \le K$.

  • CNP1b: minimise $|S|$ such that $f(S) \le P$.

  • CNP2a: minimise $C(S)$ (cardinality of the largest connected component of $G[V \setminus S]$) subject to $|S| \le K$.

  • CNP2b: minimise $|S|$ such that $C(S) \le L$ ($L$ denotes the cardinality parameter in accordance with the notation in Boginski et al., 2009; Arulselvan et al., 2011; Veremyev et al., 2014a).

  • CNP3a: maximise $H(S)$ (number of connected components of $G[V \setminus S]$) subject to $|S| \le K$.

  • CNP3b: minimise $|S|$ such that $H(S) \ge N$.
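The six formulations differ only in which measure is used and in whether the bound applies to $|S|$ or to the measure itself. The sketch below makes this explicit; it assumes the component sizes of the residual graph are already known, and the function name `evaluate` and its argument layout are illustrative rather than taken from the paper.

```python
def evaluate(variant, sizes, num_deleted, bound):
    """Return (feasible, objective) for a candidate solution.

    variant     -- one of "CNP1a", "CNP1b", "CNP2a", "CNP2b", "CNP3a", "CNP3b"
    sizes       -- component sizes of the graph after deleting S
    num_deleted -- |S|
    bound       -- K for the "a" variants; P, L or N for the "b" variants
    """
    measures = {
        "1": sum(s * (s - 1) // 2 for s in sizes),  # f(S), pair-wise connectivity
        "2": max(sizes, default=0),                 # C(S), largest component
        "3": len(sizes),                            # H(S), number of components
    }
    kind = variant[3]            # the digit in "CNP1a", "CNP2b", ...
    measure = measures[kind]
    if variant.endswith("a"):
        # Budget constrained: |S| <= K; the objective is the measure itself
        # (to be minimised for CNP1a and CNP2a, maximised for CNP3a).
        return num_deleted <= bound, measure
    # Connectivity constrained: the measure must meet its threshold and the
    # objective is |S|, always to be minimised.
    feasible = measure >= bound if kind == "3" else measure <= bound
    return feasible, num_deleted

# Two components of five nodes after deleting two nodes, as in the Fig. 1 example.
print(evaluate("CNP1a", [5, 5], 2, 2))
print(evaluate("CNP2b", [5, 5], 2, 4))
```

With the values of the Fig. 1 example, the first call is feasible for a budget of two nodes, while the second is infeasible because the largest component has five nodes and the assumed threshold is four.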

In this paper we will consider the six different types of the CNP according to the above taxonomy. Handling each of these formulations through the use of a single algorithmic framework is not straightforward. For instance, the VNS algorithm provided in Aringhieri et al. (2016b) for CNP1a, which provides good results compared to other heuristics for that problem, is hard to generalise even to the CNP1b. One main reason is the fact that finding feasible solutions for “b” types of the CNP is potentially very difficult, posing a relevant challenge for implementing the classical shaking procedures in a VNS framework and in general for the exploration of the solution space. Another important difficulty concerns the application of local search approaches. In order to improve the objective value of an instance of CNP1b, a local search procedure should involve a swap of a node from $V \setminus S$ with at least two nodes from $S$, which would increase the complexity of a move by a factor of $K/2$ compared to the “a” version (more details about local search procedures for the CNP are provided in Section 3.5). Furthermore, the development of efficient neighbourhoods is also challenging, as discussed in Aringhieri et al. (2016b).

We will demonstrate how our Evolutionary Framework (EF) can tackle any of the six problems above by using tailored reproduction and mutation operators capable of repairing the solutions through appropriate greedy rules (preliminary results of such a framework can be found in Aringhieri et al., 2016a). Such rules can effectively guide the search through the solution space, in particular when they are properly combined as pointed out by the previous work of Addis et al. (2016).
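To make the idea of a greedy repair concrete, the sketch below shows one plausible rule for CNP1a. It is an illustration under stated assumptions, not the operators actually used in the framework: it assumes that an operator has produced a deletion set with more than $K$ nodes, and it repeatedly puts back into the graph the node whose reinsertion increases $f(S)$ the least. The names `pairwise_connectivity` and `greedy_repair` are hypothetical.

```python
def pairwise_connectivity(adj, deleted):
    """f(S): number of node pairs still connected once `deleted` is removed."""
    seen, total = set(deleted), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:                      # depth-first search of one component
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        total += size * (size - 1) // 2
    return total

def greedy_repair(adj, deleted, budget):
    """Shrink an oversized deletion set to at most `budget` nodes (assumed rule)."""
    deleted = set(deleted)
    while len(deleted) > budget:
        # Candidate nodes are scanned in sorted order so that ties are broken
        # deterministically; the node whose return raises f(S) least is restored.
        best = min(sorted(deleted),
                   key=lambda v: pairwise_connectivity(adj, deleted - {v}))
        deleted.remove(best)
    return deleted

# Illustrative path graph 0-1-2-3-4 (not taken from the paper).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
repaired = greedy_repair(path, {1, 2, 3}, budget=2)
print(repaired, pairwise_connectivity(path, repaired))
```

Each step re-evaluates $f$ from scratch for every candidate, which keeps the sketch short but is far from the most efficient way to implement such a rule. Being greedy, the rule carries no optimality guarantee: on the same path with a budget of one node it ends on a set with $f(S)=3$, whereas deleting the middle node gives $f(S)=2$.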

Based on the considerations above, the aim of this work is to provide a simple and easy to implement algorithmic framework that can tackle many different versions of the CNP by embedding suitable and efficient greedy rules.

Table 1 reports the main heuristic algorithms in the literature for the different types of the CNP considered. CNP1a has gained more attention, while there exists a gap in the literature for the other five versions. We further extend the analysis of the CNP to these versions and propose a set of benchmark results which may constitute an interesting basis for comparison for future algorithms.

The paper is organised as follows. Section 2 introduces the greedy rules adopted as well as some greedy algorithms that will be used for comparison. Section 3 describes a general evolutionary algorithm for the different types of the CNP as defined above, embedding the greedy rules defined in Section 2 within the tailored reproduction and mutation operators. Section 4 discusses the results of the evolutionary algorithm over a set of benchmark instances and investigates the correlation between solutions of the different types of the CNP. Finally, Section 5 provides conclusions and remarks. The remainder of this section gives a brief overview of the existing literature on the CNP.

A pseudo-approximation algorithm is proposed in Dinh et al. (2010) to select a set of nodes $S$ whose deletion will lower pair-wise connectivity under a certain threshold (CNP1b). The minimisation of pair-wise connectivity through the deletion of $K$ nodes (CNP1a) is investigated in numerous works. Its NP-completeness is proved in Arulselvan et al. (2009), Di Summa et al. (2011) and Addis et al. (2013), while Arulselvan et al. (2009) also proposes a greedy algorithm and an ILP formulation (with $O(|V|^3)$ constraints). A generally more efficient ILP formulation with a potentially non-polynomial number of constraints is presented in Di Summa et al. (2012), while the more recent work of Veremyev et al. (2014a) proposes an alternative, more compact formulation with only $O(|V|^2)$ constraints. Several heuristic algorithms exist, based either on greedy rules (Ventresca and Aleman, 2015; Addis et al., 2016), with interesting results, or on metaheuristic methods such as Simulated Annealing and Population Based Incremental Learning (Ventresca, 2012) or Iterated Local Search and Variable Neighbourhood Search (Aringhieri et al., 2015; Aringhieri et al., 2016b). Approximation algorithms have also been proposed (Ventresca and Aleman, 2014a; Ventresca and Aleman, 2014b) but with limited applicability since they are based on an ILP formulation with $O(|V|^3)$ constraints. Polynomiality of the CNP1a over trees is established in Di Summa et al. (2011) and extended to graphs with bounded tree-width (and to the CNP2a and the CNP3a) in Addis et al. (2013).

The CNP2b has been introduced in Boginski et al. (2009) and Arulselvan et al. (2011): it seeks the smallest set $S$ inducing a graph $G[V \setminus S]$ whose largest component does not exceed a given threshold $L$ ($C(S) \le L$). An ILP formulation is given along with a greedy algorithm very similar to the one of Arulselvan et al. (2009) and a genetic algorithm. A more compact linear model is proposed in Veremyev et al. (2014a) and polynomiality of the CNP2b over trees and proper interval graphs is established in Lalou et al. (2016). The works of Shen and Smith (2012) and Shen et al. (2012) study the versions where a set of nodes of bounded cardinality $|S| \le K$ is deleted to, respectively, minimise the size of the largest connected component $C(S)$ (CNP2a) or maximise the number of connected components $H(S)$ (CNP3a). Exact algorithms are proposed for both versions, including dynamic programming approaches as well as ILP models. Contrary to the version that considers pair-wise connectivity minimisation, few efficient heuristics to tackle real world graph instances have been developed in the literature for other connectivity measures. We note that a very general mathematical model for dealing with Critical Node-Edge Deletion Problems has been proposed in Veremyev et al. (2014b), providing a linear formulation for any connectivity measure that uses the number of connected components and their size, with a limited number of variables and constraints ($O(|V|^2)$). As a side note, it can be observed that formulations CNP2b and CNP3b have strong ties to the Vertex Separator Problem (Balas and de Souza, 2005), which in its simplest form seeks to find the smallest possible set of nodes whose removal fragments the graph into two balanced components. Recent examples of exact and heuristic approaches for the Vertex Separator Problem can be found in Cavalcante and de Souza (2011) and Sánchez-Oro et al. (in press).

To summarise, the literature does provide heuristic algorithms able to tackle real graphs with up to thousands of nodes but only for two out of the six CNP types defined above (CNP1a and CNP2b). At the same time, the applicability of the ILP models is in general limited to small graphs. Although we will focus in this paper on the three connectivity metrics detailed above, we remark that many alternative ways to quantify a graph's fragmentation can be used, for example: the network's diameter (Albert et al., 2000), single/multiple-commodity maximum flow or the shortest path between given source–sink node pairs (Grubesic and Murray, 2006, Matisziw and Murray, 2009, Cormican et al., 1998, Lim and Smith, 2007, Veremyev et al., 2015).

Greedy rules and algorithms

The Evolutionary Framework we propose relies on the use of suitable greedy rules with the aim of providing an efficient and flexible tool for dealing with the different types of CNP. These greedy rules are embedded in the initialisation, reproduction and mutation phases of the evolutionary algorithm. For each type of CNP, we will discuss two complementary types of greedy rules that allow us to generate heuristic solutions quickly. These rules are based on moving a node from the set S of deleted

An evolutionary algorithm for the CNP

In this section we present a flexible Evolutionary Framework that can be applied to any of the CNP types discussed so far. Although Arulselvan et al. (2011) have designed a genetic algorithm to deal with the CNP2b, the features of their algorithm are quite different from the characteristics of our approach. More specifically, we make use of greedy rules for repair operations during reproduction and mutation phases. This is one of the key features of the framework presented in this work as it

Numerical results

In this section, we will present extensive results over two benchmark sets of instances to demonstrate the overall quality of the solutions found by our genetic algorithm. When previous results exist in the literature (i.e. for the CNP1a), we will compare our results with the best known results, otherwise we will use the greedy algorithms of Section 2 to show that our algorithmic framework provides an added value with respect to the independent use of the greedy procedures.

Conclusions

We presented a general Evolutionary Framework to solve a general problem known as the Critical Node Problem. Our framework is based on a simple genetic algorithm structure that makes use of appropriate greedy rules to repair and correct the solutions during the reproduction and mutation phases. The proposed hybrid heuristic is quickly adaptable to several formulations of the problem since only the criteria for the greedy rules have to be redesigned and implemented for a new formulation. We

Acknowledgements

The authors would like to thank Valentina Cacchiani, Francesca Cordero, Guglielmo Guastalla and Mario Ventresca for providing or indicating useful real graphs of interest for this work. We would also like to thank two anonymous referees for their invaluable help in improving the overall clarity and consistency of the paper. This work was supported by a Google Focused Grant on Mathematical Programming, project “Exact and Heuristic Algorithms for Detecting Critical Nodes in Graphs”.

References (55)

  • M. Ventresca et al. A derandomized approximation algorithm for the critical node detection problem. Comput. Oper. Res. (2014)
  • R.K. Wood. Deterministic network interdiction. Math. Comput. Model. (1993)
  • B. Addis et al. Hybrid constructive heuristics for the critical node problem. Ann. Oper. Res. (2016)
  • R. Albert et al. Error and attack tolerance of complex networks. Nature (2000)
  • Alevras, D., Grötschel, M., Wessäly, R., 1997. Capacity and Survivability Models for Telecommunication Networks....
  • Aringhieri, R., Grosso, A., Hosteins, P., 2016a. A genetic algorithm for a class of critical node problems. In:...
  • Aringhieri, R., Grosso, A., Hosteins, P., Scatamacchia, R., 2015. VNS solutions for the critical node problem. In:...
  • R. Aringhieri et al. Local search metaheuristics for the critical node problem. Networks (2016)
  • Arulselvan, A., Commander, C.W., Shylo, O., Pardalos, P.M., 2011. Cardinality-constrained critical node detection...
  • E. Balas et al. The vertex separator problem: a polyhedral investigation. Math. Program. (2005)
  • V. Boginski et al. Identifying critical nodes in protein–protein interaction networks
  • S.P. Borgatti. Identifying sets of key players in a social network. Comput. Math. Organ. (2006)
  • G. Brown et al. Defending critical infrastructure. Interfaces (2006)
  • V.F. Cavalcante et al. Exact algorithms for the vertex separator problem in graphs. Networks (2011)
  • R. Cohen et al. Efficient immunization strategies for computer networks and populations. Phys. Rev. Lett. (2003)
  • K.J. Cormican et al. Stochastic network interdiction. Oper. Res. (1998)
  • M. Di Summa et al. Branch and cut algorithms for detecting critical nodes in undirected graphs. Comput. Optim. Appl. (2012)