Hubs are high-degree nodes within a network. The examination of the emergence and centrality of hubs lies at the heart of many studies of complex networks such as telecommunication networks, biological networks, social networks and semantic networks [28], [5], [1], [22], [38]. For example, hubs have been examined in the context of complex phylogenetic diseases such as asthma [36]; indeed, deletion of a hub protein is more likely to be lethal than deletion of a non-hub protein [24], [27]. Furthermore, identifying and allocating hubs are routine tasks in a wide variety of applications [6]. The concept of hubs is also of significant interest in studies of hyperlinked environments [30], viral marketing [33], outbreaks of epidemics [4] and the Blogosphere [45].
It is natural to ask whether one can efficiently find not only a hub that is merely a single node, but also a hub that consists of at most k nodes or (more generally) exactly k nodes. More precisely, given a threshold parameter p, a hub can be defined as a node of degree at least p. Here, we introduce the concept of a multi-node hub, which is a connected set of nodes of degree at least p. In other words, we aim to identify k nodes which can act as a unit (due to the connectivity constraint) that is a hub (due to the cut constraint). For example, in the context of biological networks, we might want to seek a module or a protein complex that acts as a hub, and in the context of social networks, we might want to identify a group of people that together act as a hub. Formally, we solve the following problem.
Multi-Node Hub (MNH)
Input: An undirected graph G, and positive integers k and p.
Question: Does there exist a subset A ⊆ V(G) of size exactly k such that G[A] is connected, and E(G) contains at least p edges with exactly one endpoint in A?
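To make the three constraints in the definition concrete, checking whether a candidate set A is a valid solution takes only a few lines. The sketch below (the adjacency-list representation and helper name are our own, not from the paper) verifies the size, connectivity and cut constraints in turn:

```python
def is_mnh_solution(adj, A, k, p):
    """Check whether A is a valid multi-node hub in the graph given by
    the adjacency lists `adj`: |A| = k, G[A] is connected, and at least
    p edges have exactly one endpoint in A."""
    A = set(A)
    if len(A) != k:                      # size constraint
        return False
    # Connectivity of G[A]: DFS restricted to vertices of A.
    start = next(iter(A))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u in A and u not in seen:
                seen.add(u)
                stack.append(u)
    if seen != A:                        # connectivity constraint
        return False
    # Cut constraint: each crossing edge is counted once (v in A, u outside).
    cut = sum(1 for v in A for u in adj[v] if u not in A)
    return cut >= p
```

For instance, in a triangle {0, 1, 2} with a pendant vertex 3 attached to 2, the set A = {0, 1, 2} is a valid solution for k = 3 and p = 1 (one crossing edge), but not for p = 2.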
Observe that the edges emanating from the multi-node hub may have shared endpoints outside the multi-node hub. One may instead ask to compute a subset A ⊆ V(G) of size exactly k such that G[A] is connected and the number of vertices in V(G) \ A that have a neighbor inside A is at least p. Our objective, compared to this latter question, is to maximize the number of links/connections emanating from the multi-node hub rather than the number of its neighbors, which is in line with the aforementioned applications. For example, consider the situation where we have three proteins a, b and c in our multi-node hub that are each connected to the same three proteins x, y and z outside it, compared to the situation where a is connected only to x, b only to y, and c only to z. In the first situation, the multi-node hub appears to be more central. Of course, one may further argue that a situation where a, b and c have 9 distinct neighbors is the most desirable, but such a structure may not exist in our network, and simply demanding 9 outgoing edges, while being much less restrictive, may suffice for a determination of centrality. Further, in networks susceptible to link failures rather than node failures, optimizing the number of edges makes more sense. That being said, we find the latter question interesting, and it can be a topic for future research.
Multi-node hubs are of interest also for large values of k: when k is close to n, a multi-node hub allows us to “shave” a few nodes from the network while ensuring that the core stays connected and has a large number of links to the nodes that were shaved. In other words, we aim to identify a very large chunk of the graph that is connected on its own as well as connected to its few outliers. This can be a common scenario when we think of MNH as a variant of Max-Cut (defined below): for graphs that are relatively dense, when we simply seek a maximum cut, we do expect that each side of the cut will be relatively large. Now, we may still want our primary objective to be that of maximizing the cut size, even (or, perhaps, on purpose) when it means that k will be large. We remark that, in fact, a significant part of our work is dedicated to this case, which is specifically interesting when k is close to n.
Moreover, we believe that multi-node hubs deserve to be studied also because they are simple, elegant combinatorial objects. The MNH problem involves three different types of constraints, namely, a cut constraint, a size constraint and a connectivity constraint. On the one hand, it is easily seen that MNH is W[1]-hard with respect to k; that is, it is unlikely that MNH is FPT with respect to k. On the other hand, it can be shown to be FPT with respect to the combined parameter k + p (by a simple application of color-coding [3]). This brings us to the parameter p. By relying on a preprocessing phase inspired by algorithms for the Max Leaf Spanning Tree problem along with a novel application of special tree decompositions, namely “unbreakable tree decompositions”, we are able to establish that MNH is FPT with respect to the parameter p.
Related work: MNH is closely related to the classical NP-hard Max-Cut problem. Here, the input consists of a graph G and a positive integer p, and the objective is to check whether there is a cut of size at least p. A cut of a graph is a partition of the vertices of the graph into two disjoint subsets. The size of the cut is the number of edges whose endpoints belong to different subsets of the partition. Max-Cut has been the focus of extensive study from the algorithmic perspective of computer science as well as the extremal perspective of combinatorics. Papadimitriou and Yannakakis showed that Max-Cut is APX-hard [41]. It is also well known that given an arbitrary partition of the vertex set of G, by repeatedly moving vertices from one subset of the partition to the other as long as the size of the cut increases, one obtains a cut of size at least m/2, where m denotes the number of edges of G. A breakthrough result by Goemans and Williamson [23] gave a 0.878-approximation algorithm, which is optimal under the Unique Games Conjecture [29]. Max-Cut has also been well studied from the viewpoint of parameterized complexity [12], [13], [37], [42]. In this context, a notable work is the parameterized algorithm for an above-guarantee version of Max-Cut [13].
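The folklore local-search bound mentioned above is easy to realize in code. The sketch below (names are illustrative) repeatedly moves a vertex that has strictly more neighbors on its own side than on the other; each move increases the cut by at least one, so the loop performs at most m moves, and at a local optimum every vertex has at least half of its incident edges crossing the cut, which yields a cut of size at least m/2:

```python
def local_search_cut(n, edges):
    """Local-search 1/2-approximation for Max-Cut on vertices 0..n-1."""
    side = [0] * n                       # arbitrary starting partition
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    improved = True
    while improved:
        improved = False
        for v in range(n):
            # Moving v turns same-side edges into cut edges and vice versa,
            # so the move is profitable iff v has more same-side neighbors.
            same = sum(1 for u in adj[v] if side[u] == side[v])
            other = len(adj[v]) - same
            if same > other:
                side[v] ^= 1
                improved = True

    cut = sum(1 for u, v in edges if side[u] != side[v])
    return side, cut
```

On a triangle, for example, the procedure stops with a cut of size 2, which meets the m/2 = 1.5 guarantee.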
The (k, n − k)-Max Cut problem, a variant of Max Cut that has been extensively studied in the literature, is linked even more tightly to MNH. This variant asks whether there exists a subset A of the vertex set of a given graph G such that |A| = k and E(G) contains at least p edges with exactly one endpoint in A. For this variant, Ageev and Sviridenko gave a 0.5-approximation algorithm [2], which has been slightly improved by Feige and Langberg [19]. It is also known that this variant is W[1]-hard with respect to the parameter k [9]. With respect to the parameter p, (k, n − k)-Max Cut can be solved in FPT time [8]; this bound was subsequently improved twice [44], [43]. The latter work also gives a polynomial kernel for (k, n − k)-Max Cut with respect to the parameter p. Bonnet et al. [8] also gave a parameterized approximation scheme for (k, n − k)-Max Cut with respect to the parameter k.
Furthermore, Connected Max-Cut (CMC), the variant of Max Cut where G[A] should be a connected graph, has also been investigated in the literature. This variant is of interest, for example, in image segmentation [46]. Essentially, MNH can be defined as the variant of (k, n − k)-Max Cut that adopts the connectivity constraint of CMC. Hajiaghayi et al. [26] studied CMC from the viewpoint of approximation, and obtained an Ω(1/log n)-approximation algorithm for this problem. They explicitly note that since CMC has the flavors of both cut and connectivity problems simultaneously, well-known approaches used to develop algorithms for either connectivity problems or cut problems such as Minimum Bisection are not applicable to CMC. Recently, Lee et al. [32] developed a 0.5-approximation algorithm for the special case of CMC where the input graph has bounded treewidth. We remark that the study of the parameterized complexity of natural connectivity variants of well-known problems, such as Connected Vertex Cover [7], [14], [17], [20], [25], [39], is an active research area that has received considerable attention.
Our contribution: In this paper, we initiate the study of multi-node hubs in general, and of algorithmic aspects of MNH in particular. Since the concept of multi-node hubs is a natural generalization of the concept of hubs, it can be of significant interest in various studies of complex networks where hubs play a central role. We believe that connectivity is the most fundamental condition to demand from multi-node hubs; in particular, it is the most basic requirement that can allow the nodes to act as a single unit. In specialized scenarios, one may want the hub to be “highly-connected” or to be structured in a manner compatible with some application-specific demands. We believe that our paper can lead to several follow-up works on (possibly enriched) multi-node hubs, not only from the viewpoint of parameterized complexity theory.
While it is easy to observe that MNH is W[1]-hard with respect to the parameter k (Section 7), our main contribution is the first parameterized algorithm showing that MNH is FPT with respect to p (Section 4). Our algorithm is primarily a classification result and should be viewed as a proof of concept that the problem is FPT with respect to p. As a corollary of our algorithm, we also get that CMC is FPT with respect to p. For the sake of completeness, we also give a simple FPT algorithm parameterized by k + p (Section 8).
Despite the recent breakthrough advances for cut problems like Multicut and Minimum Bisection, MNH is still very challenging. Not only does a connectivity constraint have to be handled on top of the involved machinery developed for these problems, but the fact that MNH is a maximization problem also seems to prevent the applicability of this machinery in the first place. To deal with the latter issue, we give non-trivial reduction rules that show how MNH can be preprocessed into a problem where it is necessary to delete a bounded-in-parameter number of vertices. Then, to handle the connectivity constraint, we use a novel application of the form of unbreakable tree decomposition introduced by Cygan et al. [16] to solve Minimum Bisection, where we demonstrate how connectivity constraints can be replaced by simpler size constraints. A more detailed description of this approach can be found in Section 3. We believe that our approach can be relevant to the design of algorithms for other cut problems of this nature.
Standard notation: The O* notation hides factors polynomial in the input size. Given a graph G, let V(G) and E(G) denote its vertex set and edge set, respectively. Given a set of graphs 𝒢, let V(𝒢) = ⋃_{H ∈ 𝒢} V(H). Given a vertex v ∈ V(G), let N(v) and N[v] denote the open and closed neighborhoods of v, respectively. Let Δ(G) denote the maximum degree of a vertex in G. Given a set U ⊆ V(G), let N(U) and N[U] denote the open and closed neighborhoods of U, respectively. That is, N(U) = (⋃_{v ∈ U} N(v)) \ U, and N[U] = N(U) ∪ U.
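As a small illustration, the open and closed neighborhoods of a vertex set follow directly from these definitions; a minimal sketch (the adjacency-map representation and helper names are our own):

```python
def open_nbhd(adj, U):
    """N(U): vertices outside U with at least one neighbor in U."""
    U = set(U)
    return {u for v in U for u in adj[v]} - U

def closed_nbhd(adj, U):
    """N[U] = N(U) ∪ U."""
    return open_nbhd(adj, U) | set(U)
```

For the path 0–1–2, for example, N({1}) = {0, 2} while N[{1}] contains all three vertices.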
To handle the cut constraint, we employ the randomized contractions technique [10], for which an excellent high-level description is given by Chitnis et al. [11] (Section 1.1). Since our problem also involves a size constraint, we also rely on Theorem 2.3 (Section 2). In short, Cygan et al. [16] showed how to replace the recursion scheme of the randomized contractions technique with a dynamic programming computation over a static tree decomposition; this enabled a
In this section we prove our main result:
Theorem 4.1 The MNH problem is solvable in time .
We first solve simple cases of MNH (Section 4.1), where either the graph G contains many vertices (more precisely, ) or there exists a solution A such that contains a vertex of high-degree. Then, we turn to focus on the difficult instances whose handling relies on the properties of an unbreakable tree decomposition (Section 4.2).
In this section we prove the following lemma.
Lemma 5.1 The Single Bag problem can be solved in time .
Since there are at most k choices for q, we can next focus on solving Single Bag for a specific . Assuming that is not −∞, let be a q-valid tuple such that .
In this section, we show that by constructing a tree decomposition using Theorem 2.3 and invoking the algorithm for Single Bag of Lemma 5.1, one can solve the MNH problem in time , proving Theorem 4.1. Recall that to this end, by Lemma 4.5, it is sufficient to solve useful instances of SMNH in time . For the sake of clarity, we repeat the details given in Section 4.2.1.
Let be a useful instance of SMNH (i.e., and ). Recall that we assume WLOG that G is a
In this section we prove a simple hardness result.
Theorem 7.1 MNH is W[1]-hard with respect to the parameter k.
Proof Let be an instance of -Max-Cut. We construct an instance of MNH as follows (see Fig. 2). We let (where will be determined below), and . Moreover, and . Now, we want to show that is a yes-instance of
Given a graph G and positive integers k and p, we seek a subset A ⊆ V(G) of size exactly k such that G[A] is connected, and E(G) contains at least p edges with exactly one endpoint in A. To this end, uniformly at random, we color each vertex of G either blue or red. With probability at least 2^{-(k+p)}, we obtain that if there exists a solution A, then all vertices in A are blue and all vertices in R are red, for some subset R ⊆ N(A) such that at least p edges have one endpoint in A and the other in R.
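The coloring step above can be turned into a simple randomized procedure. The sketch below simplifies matters by assuming, for illustration only, that we look for a solution whose entire neighborhood is colored red; under that assumption a solution is exactly a blue connected component of size k with at least p outgoing edges, and repeating the trial roughly 2^(k+p) times boosts the success probability (this is a sketch under that assumption, not the paper's algorithm, and all names are ours):

```python
import random

def try_coloring(n, edges, k, p):
    """One color-coding trial (simplified: assumes a solution whose
    whole neighborhood is red, so A is a full blue component)."""
    blue = [random.random() < 0.5 for _ in range(n)]
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = [False] * n
    for s in range(n):
        if blue[s] and not seen[s]:
            # Collect the blue connected component containing s.
            comp, stack = [], [s]
            seen[s] = True
            while stack:
                v = stack.pop()
                comp.append(v)
                for u in adj[v]:
                    if blue[u] and not seen[u]:
                        seen[u] = True
                        stack.append(u)
            if len(comp) == k:
                in_comp = set(comp)
                cut = sum(1 for u, v in edges
                          if (u in in_comp) != (v in in_comp))
                if cut >= p:
                    return comp
    return None

def multi_node_hub(n, edges, k, p, trials=None):
    """Repeat the random-coloring trial; ~2^(k+p) repetitions suffice
    when the solution together with its neighborhood has <= k+p vertices."""
    trials = trials or (2 ** (k + p) * 10)
    for _ in range(trials):
        sol = try_coloring(n, edges, k, p)
        if sol is not None:
            return sol
    return None
```

On a star with center 0 and five leaves, for instance, only A = {0} can be returned for k = 1 and p = 5.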
Let Z be the set of all the red vertices. Since the
In this paper, we initiated the study of multi-node hubs in general, and of algorithmic aspects of MNH in particular. The concept of multi-node hubs is a natural generalization of the concept of hubs; therefore, it might play a central role in the analysis of complex networks. We showed that although MNH is W[1]-hard parameterized by k, it is FPT parameterized by p. Our algorithm is primarily a classification result and should be viewed as a proof of concept that the problem is FPT. While our
The two authors declare equal contribution.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
We use the greediness-of-parameterization technique for the multiparameterization, which was proposed by Bonnet et al. [3] and is based on branching algorithms. This approach has recently gained much attention in the study of fixed-cardinality parameterized problems [8,10,11]. The rest of the paper is organized as follows.
A preliminary version of this paper appeared in the proceedings of IPEC 2018.
Supported by the European Research Council (ERC) (grant agreement no. 819416) and by the Swarnajayanti Fellowship (no. DST/SJF/MSA01/2017-18).