New heuristic approaches for maximum balanced biclique problem
Introduction
Given a bipartite graph G = (U, V, E), a biclique B = (Ub, Vb) is a subgraph of G such that every pair (u, v) with u ∈ Ub and v ∈ Vb is adjacent. If |Ub| = |Vb|, then B is a balanced biclique of the given bipartite graph. The maximum balanced biclique problem (MBBP) aims to find the balanced biclique with the maximum number of vertices. The MBBP plays a prominent role in various real-world industrial applications, including defect densities in self-assembly enabled nanotechnology [15], [16], defect tolerance for nanotechnology crossbar switches [1], [17], programmable logic array folding in VLSI theory [14], and computational biology problems such as the gene expression data problem [24].
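To make the definition concrete, the following minimal sketch checks whether two vertex sets form a balanced biclique, taking "balanced" to mean that both sides contain the same number of vertices. The adjacency-dictionary representation, the function name is_balanced_biclique, and the toy graph are illustrative assumptions, not part of the paper.

```python
from typing import Dict, Hashable, Set

# Assumed representation: each vertex u in U maps to its neighbourhood N(u) in V.
Adjacency = Dict[Hashable, Set[Hashable]]


def is_balanced_biclique(adj: Adjacency, ub: Set[Hashable], vb: Set[Hashable]) -> bool:
    """Return True if (ub, vb) is a balanced biclique of the graph described by adj."""
    # Balanced: both sides have the same number of vertices.
    if len(ub) != len(vb):
        return False
    # Biclique: every u in ub is adjacent to every v in vb.
    return all(vb <= adj.get(u, set()) for u in ub)


# Hypothetical toy graph: u1 and u2 are adjacent to v1 and v2; u3 only to v3.
adj = {"u1": {"v1", "v2"}, "u2": {"v1", "v2"}, "u3": {"v3"}}
print(is_balanced_biclique(adj, {"u1", "u2"}, {"v1", "v2"}))  # True
print(is_balanced_biclique(adj, {"u1", "u3"}, {"v1", "v3"}))  # False
```

The check costs one subset test per vertex of ub, which is enough for validating the output of a heuristic on small instances.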
The MBBP has been proven to be NP-hard [8], [13], meaning that unless P = NP, there are no polynomial-time algorithms to solve the problem. Additionally, it is difficult to approximate the problem and state-of-the-art approximation algorithms can only achieve an approximation ratio of for some θ > 0 [6]. Because of the hardness of the MBBP, a huge amount of effort has been devoted to finding an acceptable balanced biclique within a reasonable time. To date, most practical algorithms for solving the MBBP have been heuristic algorithms.
A popular method for solving the MBBP is the node-deletion-based method [1], [16], [25], [26], which solves the MBBP by converting the problem into a maximum balanced independent set problem in a complement bipartite graph. An early node-deletion-based algorithm for the MBBP implemented an application-independent defect tolerant design flow by removing the vertices with the maximum degree [16]. Based on [16], Al-Yamani et al. [1] designed an improved algorithm to handle larger bicliques. A key improvement in their algorithm was the removal of one vertex in an area that is adjacent to the maximum number of vertices with the minimum degree in the other area. Combining the key ideas of these two algorithms [1], [16] leads to a more advanced heuristic that first deletes the vertex with the minimum degree in one area and then removes the vertex with the maximum degree in the other area [25]. The resulting algorithm, called Alg3 in [25], is more efficient than those in [1], [16]. Additionally, Alg3 attempts to reduce the degree of the vertex with the smallest degree in one area as in [1] and also reduces the number of edges in the bipartite graph as in [16]. A recent node-deletion-based algorithm [26] drops all vertices adjacent to the vertex with the minimum degree in each iteration, which considerably reduces the number of major loops and yields superior performance. It also employs the heuristics from [16] and [1].
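The node-deletion idea can be illustrated with a deliberately simplified sketch. It builds the bipartite complement, repeatedly deletes the vertex of maximum complement degree until no complement edge remains (so the survivors form an independent set in the complement, i.e., a biclique in the original graph), and then trims the larger side to restore balance. This is not the exact procedure of [1], [16], [25], or [26]; the function name, the tie-breaking, and the trimming step are assumptions made for illustration, and vertex names are assumed to be distinct across the two sides.

```python
from typing import Dict, List, Set, Tuple


def node_deletion_biclique(
    us: List[str], vs: List[str], edges: Set[Tuple[str, str]]
) -> Tuple[Set[str], Set[str]]:
    """Greedy node deletion on the bipartite complement (simplified sketch)."""
    # Bipartite complement: u and v are joined iff (u, v) is NOT an edge of G.
    comp: Dict[str, Set[str]] = {x: set() for x in us + vs}
    for u in us:
        for v in vs:
            if (u, v) not in edges:
                comp[u].add(v)
                comp[v].add(u)
    # Delete the maximum-degree complement vertex until no complement edge is left.
    while True:
        worst = max(comp, key=lambda x: len(comp[x]), default=None)
        if worst is None or not comp[worst]:
            break
        for nb in comp.pop(worst):
            comp[nb].discard(worst)
    # Survivors form a biclique in G; trim the larger side so both sides match.
    left = [u for u in us if u in comp]
    right = [v for v in vs if v in comp]
    size = min(len(left), len(right))
    return set(left[:size]), set(right[:size])


# Hypothetical toy instance: a 2x2 biclique plus one extra edge (u3, v3).
edges = {("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2"), ("u3", "v3")}
left, right = node_deletion_biclique(["u1", "u2", "u3"], ["v1", "v2", "v3"], edges)
print(sorted(left), sorted(right))  # ['u1', 'u2'] ['v1', 'v2']
```

Building the complement explicitly takes time proportional to |U| times |V|, so this form is only suitable for small or dense instances.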
Furthermore, a popular method for tackling hard combinatorial optimization problems is local search, which can find good solutions within reasonable time and typically remains effective for solving very large problems. Local search has been successfully applied to various combinatorial optimization problems, including the maximum satisfiability problem [5], minimum weighted vertex cover problem [10], vertex separator problem [3], graph coloring problem [28], maximum weight clique problem [19], minimum set covering problem [21], and many others. However, as far as we know, there is only one local search algorithm for solving the MBBP, which is called the evolutionary algorithm with structure mutation (EA/SM) [27]. In EA/SM, a local search combined with a repair-assisted restart process is used to solve the MBBP. The novel SM mutation operator was introduced to enhance exploration during the local search process. The SM can change the structure of solutions dynamically while keeping their size (fitness) and feasibility unchanged. Additionally, EA/SM implements a type of large mutation in the structure space of the MBBP to help the algorithm escape from local optima. A local search operator was also proposed for the EA/SM to improve the quality of solutions efficiently and a novel repair-assisted restart process was designed to repair every new solution reinitialized. According to the experiments in [27], EA/SM outperforms previous node-deletion-based algorithms [1], [16], [25], [26] on classical random benchmarks. This indicates that local search is a promising method for solving the MBBP and that it deserves further research.
In this paper, we develop a novel local search framework based on pair operations (POLS), which differs from the previous local search algorithms for the MBBP based on one-vertex operations (i.e., adding or removing a single vertex in each step). Our local search framework combines an extension phase and a restarting phase. There are two basic operations in our framework: vertex pair addition and vertex pair removal. Specifically, given a bipartite graph G = (U, V, E) and a candidate solution S = (Us, Vs), our algorithm searches for vertex pairs (u, v), where u ∉ Us and v ∉ Vs, such that u is adjacent to every vertex vs ∈ Vs and v is adjacent to every vertex us ∈ Us. If the algorithm finds such vertex pairs, it selects one pair to add to the candidate solution, which constitutes the pair addition operation. The pair removal operation selects u in one area Us of the candidate solution S and v in the other area Vs, then removes this pair from the candidate solution. Another feature that distinguishes our local search algorithm from the previous local search algorithms for the MBBP is that our algorithm only searches among valid solutions, meaning it guarantees that the candidate solution S after each step is always a balanced biclique. Although the previous EA/SM local search algorithm [27] maintains a biclique during the search, it is not necessarily a balanced biclique.
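The two pair operations can be sketched as follows. This is a minimal illustration under stated assumptions: the graph is stored as two adjacency dictionaries (adj_u for vertices of U, adj_v for vertices of V), and the function names are hypothetical. The sketch also requires u and v to be adjacent to each other, which the enlarged set needs in order to remain a biclique.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]


def candidate_pairs(
    adj_u: Dict[str, Set[str]], adj_v: Dict[str, Set[str]], us: Set[str], vs: Set[str]
) -> List[Pair]:
    """List pairs (u, v) whose addition keeps (us, vs) a balanced biclique."""
    # u must lie outside Us and be adjacent to every vertex currently in Vs.
    cand_u = sorted(u for u in adj_u if u not in us and vs <= adj_u[u])
    # v must lie outside Vs and be adjacent to every vertex currently in Us.
    cand_v = sorted(v for v in adj_v if v not in vs and us <= adj_v[v])
    # u and v must also be adjacent to each other.
    return [(u, v) for u in cand_u for v in cand_v if v in adj_u[u]]


def add_pair(us: Set[str], vs: Set[str], pair: Pair) -> Tuple[Set[str], Set[str]]:
    """Pair addition: both sides grow by one vertex, so balance is preserved."""
    return us | {pair[0]}, vs | {pair[1]}


def remove_pair(us: Set[str], vs: Set[str], pair: Pair) -> Tuple[Set[str], Set[str]]:
    """Pair removal: both sides shrink by one vertex, so balance is preserved."""
    return us - {pair[0]}, vs - {pair[1]}


# Hypothetical toy graph: u1 and u2 are adjacent to v1 and v2; u3 only to v3.
adj_u = {"u1": {"v1", "v2"}, "u2": {"v1", "v2"}, "u3": {"v3"}}
adj_v = {"v1": {"u1", "u2"}, "v2": {"u1", "u2"}, "v3": {"u3"}}
us, vs = add_pair(set(), set(), ("u1", "v1"))
print(candidate_pairs(adj_u, adj_v, us, vs))  # [('u2', 'v2')]
```

Because every step adds or removes exactly one vertex on each side, any sequence of these operations starting from the empty solution stays within the set of balanced bicliques.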
We also propose four new heuristics for the MBBP. The first three deal with how to select the pairs of vertices for addition or removal and the final heuristic is a reduction rule. Based on the proposed framework and these heuristics, we develop two local search algorithms, the latter of which is an improved version of the former for massive bipartite graphs.
The first heuristic is a novel scoring function for choosing the pairs of vertices for addition. For a candidate addition pair, the scoring function takes into account both the lower and upper bounds of the size of the maximal solution extended from the current solution after adding the candidate pair. This value predicts the size of the solution that can be constructed after adding the candidate vertex pair. Thus, this scoring function is called the prediction score (pscore). Specifically, a cost-effective upper bound is proposed so that pscore can be calculated with low time complexity. Our algorithm chooses the pair of vertices for addition with the greatest pscore.
The second heuristic is a robust self-adaptive restarting (RSR) heuristic, which aims to improve local search by restarting the search if it cannot find a better solution within a certain number of steps. It may take many steps for the algorithm to find a better solution if the search stays in a poor search area containing no (or few) high quality solutions, which could waste a considerable amount of time. To avoid this drawback, we propose a self-adaptive restarting heuristic to dynamically restart the search process. Specifically, if the algorithm cannot find a better solution within a self-adaptive number of search steps, we remove certain vertex pairs from the current candidate solution so that the algorithm can search in a different direction.
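A restart controller of this kind might look like the sketch below. The exact adaptation rule of RSR is not given in this passage, so the rule used here (double the patience after each restart up to a cap, and reset it after an improvement) is purely an assumption, as are the names RestartController and drop_pairs. The solution is represented as a list of (u, v) pairs so that dropping pairs keeps it balanced.

```python
import random
from typing import List, Tuple


class RestartController:
    """Counts non-improving steps and signals when a restart is due (sketch)."""

    def __init__(self, base_limit: int, max_limit: int) -> None:
        self.base_limit = base_limit
        self.max_limit = max_limit
        self.limit = base_limit  # current (adaptive) patience
        self.no_improve = 0      # consecutive steps without improvement

    def record(self, improved: bool) -> bool:
        """Record one search step; return True when the search should restart."""
        if improved:
            self.no_improve = 0
            self.limit = self.base_limit
            return False
        self.no_improve += 1
        if self.no_improve < self.limit:
            return False
        # Assumed adaptation rule: be more patient after every restart.
        self.no_improve = 0
        self.limit = min(2 * self.limit, self.max_limit)
        return True


def drop_pairs(pairs: List[Tuple[str, str]], k: int, rng: random.Random) -> List[Tuple[str, str]]:
    """Remove k randomly chosen pairs; a subset of a balanced biclique stays one."""
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    return shuffled[min(k, len(shuffled)):]


ctrl = RestartController(base_limit=2, max_limit=8)
print([ctrl.record(False) for _ in range(6)])  # [False, True, False, False, False, True]
```

In a full search loop, a True return value from record would trigger drop_pairs on the current candidate solution before the extension phase resumes.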
The above two heuristics are used in developing a local search algorithm for the MBBP, called POLS with pscore and RSR (PSRS). We carry out experiments to compare PSRS to the state-of-the-art MBBP algorithms [1], [27] on various benchmarks from the literature, including randomly generated classical instances [27] and a broad range of massive bipartite graphs with nearly one billion edges. Experimental results demonstrate that PSRS significantly outperforms previous algorithms and improves upon the best known solution quality for certain difficult instances.
In order to improve the performance of PSRS on massive bipartite graphs, we propose two additional heuristics. The third heuristic is a two-mode perturbation (TMP) heuristic, which combines the greedy selection rule based on pscore with a randomized selection strategy. PSRS typically chooses the pair of vertices for addition with the greatest pscore. However, for massive bipartite graphs, it is very time consuming to find the pair of vertices with the greatest pscore, because there are too many candidate vertex pairs. Additionally, most real-world massive bipartite graphs are very sparse, meaning pure greedy heuristics can easily lead the search into local optima. Based on these two considerations, we improve the selection heuristic by incorporating a randomized selection strategy with a certain probability, leading to the TMP heuristic. This reduces the average cost of choosing the vertex pair for addition, and also introduces additional diversification.
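The two-mode selection can be sketched in a few lines. Here the greedy mode is taken with probability q and the random mode otherwise; which mode the probability governs in the actual TMP heuristic is an assumption, and the score argument is a hypothetical stand-in for pscore, whose definition is not reproduced in this passage.

```python
import random
from typing import Callable, Sequence, Tuple

Pair = Tuple[str, str]


def choose_pair(
    candidates: Sequence[Pair],
    score: Callable[[Pair], float],
    q: float,
    rng: random.Random,
) -> Pair:
    """Pick a pair greedily with probability q, otherwise uniformly at random."""
    if rng.random() < q:
        # Greedy mode: scan all candidates for the best score.
        return max(candidates, key=score)
    # Random mode: constant-time choice that also diversifies the search.
    return rng.choice(candidates)


# Hypothetical scores for three candidate pairs.
scores = {("u1", "v1"): 3.0, ("u2", "v2"): 5.0, ("u3", "v3"): 1.0}
rng = random.Random(42)
print(choose_pair(list(scores), scores.get, q=1.0, rng=rng))  # ('u2', 'v2')
```

The random mode avoids evaluating the score of every candidate, which is where the reduction in average selection cost comes from.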
The final heuristic is the k-bipartite core reduction rule (KCR), which is used to reduce the scale of massive bipartite graphs by deleting vertices that are impossible to include in any optimal solutions. This rule is based on a heuristic called the k-bipartite core, which is inspired by the heuristic of the k-core [18].
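The precise definition of the k-bipartite core is not given in this passage, so the sketch below falls back on standard k-core style peeling: it repeatedly deletes every vertex whose degree is below a threshold k. The link to the MBBP is that any vertex of a balanced biclique with k vertices per side has degree at least k, so vertices removed by this peeling cannot belong to such a biclique. The function name and the symmetric adjacency representation (every vertex of both sides appears as a key) are assumptions.

```python
from collections import deque
from typing import Dict, Set


def peel_low_degree(adj: Dict[str, Set[str]], k: int) -> Dict[str, Set[str]]:
    """Repeatedly delete vertices of degree < k; return the reduced graph."""
    # Work on a copy so the caller's graph is left untouched.
    adj = {x: set(nbs) for x, nbs in adj.items()}
    queue = deque(x for x in adj if len(adj[x]) < k)
    removed: Set[str] = set()
    while queue:
        x = queue.popleft()
        if x in removed:
            continue
        removed.add(x)
        # Deleting x lowers the degree of each neighbour, which may expose it.
        for nb in adj.pop(x):
            adj[nb].discard(x)
            if len(adj[nb]) < k and nb not in removed:
                queue.append(nb)
    return adj


# Hypothetical toy graph: a 2x2 biclique with a pendant path u3 - v3 attached via v1.
graph = {
    "u1": {"v1", "v2"}, "u2": {"v1", "v2"}, "u3": {"v1", "v3"},
    "v1": {"u1", "u2", "u3"}, "v2": {"u1", "u2"}, "v3": {"u3"},
}
print(sorted(peel_low_degree(graph, k=2)))  # ['u1', 'u2', 'v1', 'v2']
```

If the best balanced biclique found so far has b vertices per side, calling the function with k = b + 1 keeps every vertex that could still take part in a strictly larger solution.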
We improve the PSRS algorithm by using the third and fourth heuristics, and the resulting algorithm is called PSRS+ (PSRS with TMP and KCR), which is more effective for massive graphs. We select 17 massive bipartite graphs from [9] and use them to test the performances of EA/SM, PSRS, and PSRS+. Experiments demonstrate that PSRS+ greatly improves upon PSRS and significantly outperforms EA/SM on all massive real graphs.
We also conduct experimental analysis and additional investigations on the heuristics presented in this work. Specifically, we compare PSRS and PSRS+ with several alternative versions that operate without using one of the aforementioned heuristics. The experimental results demonstrate the effectiveness of the proposed heuristics.
In the next section, we introduce some necessary background knowledge and previous MBBP algorithms. We then propose a local search framework based on pair operations. Next, we describe our novel scoring function, self-adaptive restarting heuristic, and PSRS algorithm. We then present the TMP heuristic and KCR reduction rule, and present the PSRS+ algorithm. Sections 6 and 7 present the experimental results for PSRS and PSRS+, as well as experiments validating the effectiveness of the proposed novel heuristics on some benchmarks. Finally, we provide some concluding remarks.
Section snippets
Basic definitions and notations
Given a bipartite graph G = (U, V, E), G can be divided into two disjoint vertex sets U = {u1, u2, …, un} and V = {v1, v2, …, vm} such that every edge connects one vertex in U to one vertex in V. E = {e1, e2, …, et} is the set of edges. The neighborhood of a vertex u ∈ U is N(u) = {v ∈ V | (v, u) ∈ E}. Similarly, the neighborhood of a vertex v ∈ V is N(v) = {u ∈ U | (v, u) ∈ E}. The degree of a vertex v is the size of its neighborhood and is denoted |N(v)|. The size of a bipartite graph is defined as
Two novel pair operations in local search for the MBBP
Local search algorithms perform searches within corresponding search spaces. The key to defining a search space is how the algorithms transform a candidate solution into a different solution.
PSRS algorithm
Based on the POLS framework, we propose an algorithm for solving the MBBP called PSRS. In this section, we introduce the PSRS algorithm and describe two of its important components.
The PSRS algorithm is outlined at a high level in Algorithm 2 and described below. In the beginning, the current candidate solution S is the empty set. The algorithm then initializes NoImprove, which denotes the number of non-improvement iterations, and iter (when NoImprove reaches iter, the RSR heuristic initiates
Two novel ideas for real massive bipartite graphs
In this section, we collect massive bipartite graphs from the Koblenz Network Collection (KONECT) [9], which contains network datasets from the areas of web science, network science, etc. We list the 17 selected real massive bipartite graphs in Table 1.
Some major characteristics of each selected instance appear in Table 1. The columns are: The names of the instances (Instance), numbers of left vertices (|V|), numbers of right vertices (|U|), total numbers of vertices (), total numbers of
Experimental results on random benchmarks
In this section, we perform extensive experiments to test the performance of our algorithm on two random benchmarks, including a classical random benchmark (30 instances) and some new massive benchmarks (90 instances), where all instances are randomly generated as in previous works [25], [26]. These instances have sizes of 250, 500, 1000, 5000, 10,000, 20,000, 30,000, and 40,000. The probability p that a particular edge exists in the given bipartite graphs has three values: p = 95%, 90%, and
Experimental results on massive benchmarks
In this section, we evaluate the performance of PSRS+TM and PSRS+TMKC on the massive bipartite graphs. For PSRS+TM and PSRS+TMKC, the time limit was 1000 s. The parameter q was set to 0.8. For every instance, each algorithm performed 10 independent runs with different random seeds. In the massive benchmark experiment, α and β were also set to 10,000 and 10, respectively.
In Table 5, for each algorithm, we list the maximum value (max), average value (avg) of 10 independent runs, and real run time
Summary and future work
This paper presented two fast local search algorithms called PSRS and PSRS+ for the MBBP. We proposed a new prediction selection strategy based on pairs of vertices and designed an addition rule to find good search spaces. Furthermore, we introduced the RSR heuristic to overcome the cycling and restart problems. Experimental results indicated that PSRS performs better than four previous state-of-the-art algorithms on all random instances in terms of quality of solution values. More importantly,
Acknowledgements
This work was supported in part by NSFC (under Grant nos. 61370156, 61503074, 61502464, 61402070, 61403077, and 61403076) and China National 973 program 2014CB340301.
References (28)
- et al., New local search methods for partial maxsat, Artif. Intell. (2016)
- The np-completeness column: an ongoing guide, J. Algorithms (1985)
- et al., An efficient local search framework for the minimum weighted vertex cover problem, Inf. Sci. (2016)
- et al., Led: a fast overlapping communities detection algorithm based on structural clustering, Neurocomputing (2016)
- et al., Two efficient local search algorithms for maximum weight clique problem, Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (2016)
- et al., Reinforcement learning based local search for grouping problems: a case study on graph coloring, Expert Syst. Appl. (2016)
- et al., A defect tolerance scheme for nanotechnology circuits, Circuits Syst. I (2007)
- Population-based incremental learning. a method for integrating genetic search based function optimization and competitive learning, Technical Report (1994)
- et al., Breakout local search for the vertex separator problem, IJCAI (2013)
- et al., Fast solving maximum weight clique problem in massive graphs, Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, 9–15 July 2016 (2016)
- Hardness of approximation of the balanced complete bipartite subgraph problem, Technical report
- Tabu search-part i, ORSA J. Comput.
- Konect: the koblenz network collection, Proceedings of the 22nd International Conference on World Wide Web
- Genetic, Iterated and Multistart Local Search for the Maximum Clique Problem, Applications of Evolutionary Computing