Abstract
The problem of finding large complete subgraphs in bipartite graphs (that is, bicliques) is a well-known combinatorial optimization problem referred to as the maximum-edge biclique problem (MBP), and has many applications, e.g., in web community discovery, biological data analysis and text mining. In this paper, we present a new continuous characterization for MBP. Given a bipartite graph \(G\), we formulate a continuous optimization problem (namely, an approximate rank-one matrix factorization problem with nonnegativity constraints, R1N for short), and show that there is a one-to-one correspondence between (1) the maximum (i.e., the largest) bicliques of \(G\) and the global minima of R1N, and (2) the maximal bicliques of \(G\) (i.e., bicliques not contained in any larger biclique) and the local minima of R1N. We also show that every stationary point of R1N must be close to a biclique of \(G\). This allows us to design a new type of biclique finding algorithm based on the application of a block-coordinate descent scheme to R1N. We show that this algorithm, whose computational cost per iteration is proportional to the number of edges in the graph, is guaranteed to converge to a biclique and that it performs competitively with existing methods on random graphs and text mining datasets. Finally, we show how R1N is closely related to the Motzkin–Strauss formalism for cliques.
Notes
The first-order stationarity condition of \(\text{ R1N }_d(G)\) for variables \(v\) is given by \(v = \max \left( 0,{M^Tu}/{||u||_2^2}\right) \), see Sect. 3.3. Therefore, local and global minimizers of \(\text{ R1N }_d(G)\) must satisfy this condition, hence they exactly correspond to the local and global minimizers of the problem in the new variables \((x,y)\).
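The closed-form update in this note suggests a simple alternating scheme. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes that \(M\) has entry 1 on each edge of \(G\) and \(-d\) on each non-edge (consistent with the \(-d\) entries used in Appendix A), that the update for \(u\) mirrors the one given for \(v\), and the names `build_M` and `r1n_bcd`, the starting vector and the iteration count are illustrative choices.

```python
import numpy as np

def build_M(A, d):
    """Assumed construction: entry 1 on each edge of G, -d on each non-edge."""
    A = np.asarray(A, dtype=float)
    return A - d * (1.0 - A)

def r1n_bcd(M, v0, iters=100):
    """Alternate the closed-form nonnegative updates of u and v."""
    v = np.asarray(v0, dtype=float)
    u = np.zeros(M.shape[0])
    for _ in range(iters):
        if not v.any():                      # trivial point reached, stop
            break
        u = np.maximum(0.0, M @ v / (v @ v))
        if not u.any():
            break
        v = np.maximum(0.0, M.T @ u / (u @ u))
    return u, v

# Toy bipartite graph: a 2 x 2 biclique plus one isolated edge.
A = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1]])
u, v = r1n_bcd(build_M(A, d=3), v0=[1.0, 0.5, 0.1])
print(np.round(np.outer(u, v), 6))
```

On this toy graph the product \(uv^T\) settles on the indicator matrix of the \(2 \times 2\) biclique, although \(u\) and \(v\) themselves are only determined up to a positive scaling.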
By Wedin’s theorem (cf. matrix perturbation theory [17]), singular subspaces of \(M\) associated with a positive singular value depend continuously on \(d\).
In practice, we used a safety procedure which reduces the value of \(d\) whenever \(u\) or \(v\) is set to zero and resets \(u\) and \(v\) to their previous values.
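One possible reading of this safety procedure is sketched below. The note does not say how much \(d\) is reduced, so the halving factor `shrink=0.5` is an assumption, as are the function name `bcd_with_safety`, the construction of \(M\) (1 on edges, \(-d\) elsewhere) and the fixed iteration budget.

```python
import numpy as np

def bcd_with_safety(A, v0, d, shrink=0.5, iters=200):
    """Alternating updates that fall back to the previous iterate on collapse."""
    A = np.asarray(A, dtype=float)
    v = np.asarray(v0, dtype=float)          # v0 must be nonzero
    u = np.zeros(A.shape[0])
    for _ in range(iters):
        M = A - d * (1.0 - A)                # assumed: 1 on edges, -d elsewhere
        u_new = np.maximum(0.0, M @ v / (v @ v))
        if u_new.any():
            v_new = np.maximum(0.0, M.T @ u_new / (u_new @ u_new))
        else:
            v_new = np.zeros_like(v)
        if not u_new.any() or not v_new.any():
            d *= shrink                      # reduce d (assumed factor), keep old (u, v)
            continue
        u, v = u_new, v_new
    return u, v, d

A = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1]])
u, v, d_final = bcd_with_safety(A, v0=np.ones(3), d=3.0)
print(d_final, np.round(np.outer(u, v), 6))
```

With the all-ones start and \(d = 3\), the first update sets \(u\) to zero, so the sketch halves \(d\) once and then proceeds from the unchanged starting point.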
Additional tweaking of parameters MIPFocus, Heuristics, PreQLinearize, MIQCPMethod and RINS did not lead to better results.
References
1. Alexe, G., Alexe, S., Crama, Y., Foldes, S., Hammer, P., Simeone, B.: Consensus algorithms for the generation of all maximal bicliques. Discret. Appl. Math. 145(1), 11–21 (2004)
2. Bomze, I.: Evolution towards the maximum clique. J. Glob. Opt. 10, 143–164 (1997)
3. Ding, C., Li, T., Jordan, M.: Nonnegative matrix factorization for combinatorial optimization: spectral clustering, graph matching, and clique finding. In: IEEE International Conference on Data Mining, pp. 183–192 (2008)
4. Ding, C., Zhang, Y., Li, T., Holbrook, S.: Biclustering protein complex interactions with a biclique finding algorithm. In: IEEE International Conference on Data Mining, pp. 178–187 (2006)
5. Dolan, E., Moré, J.: Benchmarking optimization software with performance profiles. Math. Prog. Ser. A 91, 201–213 (2002)
6. Gibbons, L., Hearn, D., Pardalos, P., Ramana, M.: Continuous characterizations of the maximum clique problem. Math. Oper. Res. 22(3), 754–768 (1997)
7. Gillis, N.: Nonnegative Matrix Factorization: Complexity, Algorithms and Applications. Ph.D. Thesis, Université catholique de Louvain (2011)
8. Gillis, N., Glineur, F.: Nonnegative Factorization and The Maximum Edge Biclique Problem (2008). CORE Discuss. pap. 2008/64
9. Golub, G., Van Loan, C.: Matrix Computations, 3rd edn. The Johns Hopkins University Press, Baltimore (1996)
10. Grippo, L., Sciandrone, M.: On the convergence of the block nonlinear Gauss–Seidel method under convex constraints. Oper. Res. Lett. 26, 127–136 (2000)
11. Gurobi Optimization, I.: Gurobi Optimizer Reference Manual (2012). http://www.gurobi.com
12. Lehmann, S., Schwartz, M., Hansen, L.: Biclique communities. Phys. Rev. E 78(1), 016108 (2008)
13. Liu, G., Sim, K., Li, J.: Efficient Mining of Large Maximal Bicliques, Lect. Notes in Comput. Sci. pp. 437–448. Springer, Berlin (2006)
14. Motzkin, T., Strauss, E.: Maxima for graphs and a new proof of a theorem of Turan. Can. J. Math. 17, 533–540 (1965)
15. Peeters, R.: The maximum edge biclique problem is NP-complete. Discret. Appl. Math. 131(3), 651–654 (2003)
16. Prelic, A., Bleuler, S., Zimmermann, P., Wille, A., Bühlmann, P., Gruissem, W., Hennig, L., Thiele, L., Zitzler, E.: A systematic comparison and evaluation of biclustering methods for gene expression data. Bioinformatics 22(9), 1122–1129 (2006)
17. Stewart, G., Sun, J.G.: Matrix Perturbation Theory. Academic Press, San Diego (1990)
18. Zhong, S., Ghosh, J.: Generative model-based document clustering: a comparative study. Knowl. Inf. Syst. 8(3), 374–384 (2005)
Additional information
This paper presents research results of the Belgian Network DYSCO (Dynamical Systems, Control, and Optimization), funded by the Interuniversity Attraction Poles Programme initiated by the Belgian Science Policy Office. The first author is a postdoctoral researcher with the Fonds de la Recherche Scientifique-FNRS (F.R.S.-FNRS).
Appendices
Appendix A: Proof of Theorem 1
Let us show that \(B(G) \subseteq \mathcal{L }_d(G)\) for any \(d \ge \max (m,n)\). Let \(uv^T \in B(G)\), with \(u\) and \(v\) binary without loss of generality. The binary rank-one matrix \(uv^T\) belongs to \(\mathcal{L }_d(G)\) if and only if there exists \(\epsilon > 0\) such that for all \(x \in \mathcal{B }_+(u,\epsilon )\) and \(y \in \mathcal{B }_+(v,\epsilon )\), we have \(||M-uv^T||_F^2 \le ||M-xy^T||_F^2\).
Let then \(x \in \mathcal{B }_+(u,\epsilon )\) and \(y \in \mathcal{B }_+(v,\epsilon )\), and let us denote by \(S_u, S_v, S_x\) and \(S_y\) the supports of \(u, v, x\) and \(y\), respectively. For \(\epsilon < 1\), since \(u\) and \(v\) are binary, we have \(S_u \subseteq S_x\) and \(S_v \subseteq S_y\) (i.e., \(u_i = 1 \Rightarrow x_i > 0\) and \(v_j = 1 \Rightarrow y_j > 0\)). Since both \(uv^T\) and \(xy^T\) then vanish outside \(S_x \times S_y\), this implies that for \(\epsilon < 1, ||M-uv^T||_F^2 \le ||M-xy^T||_F^2\) if and only if
$$\begin{aligned} ||M(S_x,S_y)-u(S_x)v(S_y)^T||_F^2 \le ||M(S_x,S_y)-x(S_x)y(S_y)^T||_F^2. \end{aligned}$$
Let us denote \(\bar{S_u} = S_x \backslash S_u\) and \(\bar{S_v} = S_y \backslash S_v\). Since \(x \in \mathcal{B }_+(u,\epsilon )\), there exists \(\delta u\) such that \(x = u + \epsilon \delta u\) with \(||\delta u||_2 \le 1\) and \(\delta u(\bar{S_u}) \ge 0\) since \(u(\bar{S_u}) = 0\); symmetrically there exists \(\delta v\) such that \(y = v + \epsilon \delta v\) with \(||\delta v||_2 \le 1\) and \(\delta v(\bar{S_v}) \ge 0\).
Let us analyze the four submatrices of \(M(S_x,S_y)\) corresponding to the decompositions \(S_x = S_u \cup \bar{S_u}\) and \(S_y = S_v \cup \bar{S_v}\).
1. Submatrix \((S_u,S_v)\). Since \(M(S_u,S_v) = \mathbf{1}_{|S_u|\times |S_v|}, u({S_u}) = \mathbf{1}_{|S_u|}\) and \(v({S_v}) = \mathbf{1}_{|S_v|}\),
$$\begin{aligned} e_1 = ||M(S_u,S_v)-x(S_u)y(S_v)^T||_F^2 \ge ||M(S_u,S_v)-u(S_u)v(S_v)^T||_F^2 = 0. \end{aligned}$$
2. Submatrix \((\bar{S_u},\bar{S_v})\). Since \(u(\bar{S_u}) = 0, v(\bar{S_v}) = 0\) and \(||M(\bar{S_u},\bar{S_v})||_F^2 \le |\bar{S_u}||\bar{S_v}|d^2 \le mnd^2\) for \(d \ge 1\),
$$\begin{aligned} e_2&= ||M(\bar{S_u},\bar{S_v})-x(\bar{S_u})y(\bar{S_v})^T||_F^2 = ||M(\bar{S_u},\bar{S_v}) - \epsilon ^2 \delta u(\bar{S_u}) \delta v(\bar{S_v})^T||_F^2 \\&\ge ||M(\bar{S_u},\bar{S_v}) - u(\bar{S_u}) v(\bar{S_v})^T||_F^2 - 2 \sqrt{mn}\, d\, \epsilon ^2 ||\delta u(\bar{S_u}) \delta v(\bar{S_v})^T||_F. \end{aligned}$$
In fact, recall that \(||A-B||_F^2 = ||A||_F^2 - 2 \sum _{ij} A_{ij} B_{ij} + ||B||_F^2 \ge ||A||_F^2 - 2 ||A||_F ||B||_F\).
3. Submatrix \(({S_u},\bar{S_v})\). Since \(u({S_u}) = \mathbf{1}_{|S_u|}, v(\bar{S_v}) = \mathbf{0}_{|\bar{S_v}|}, d \ge 1\) and \(\epsilon < 1\),
$$\begin{aligned} e_3&= ||M({S_u},\bar{S_v})-x({S_u})y(\bar{S_v})^T||_F^2 \\&= ||M({S_u},\bar{S_v}) - \epsilon (\mathbf{{1}}_{|S_u|}+\epsilon \delta u({S_u})) \delta v(\bar{S_v})^T||_F^2 \\&= ||M({S_u},\bar{S_v}) - \epsilon \mathbf{{1}}_{|S_u|} \delta v(\bar{S_v})^T - \epsilon ^2 \delta u({S_u}) \delta v(\bar{S_v})^T||_F^2 \\&\ge ||M({S_u},\bar{S_v}) - \epsilon \mathbf{{1}}_{|S_u|} \delta v(\bar{S_v})^T ||_F^2 - 2\sqrt{mn}(d+1) \epsilon ^2 ||\delta u({S_u}) \delta v(\bar{S_v})^T||_F. \end{aligned}$$
In fact, one can check that each entry of \(M({S_u},\bar{S_v}) - \epsilon \mathbf{{1}}_{|S_u|} \delta v(\bar{S_v})^T\) is at most \(d+1\) in absolute value for \(\epsilon < 1\) since \(|\delta v(\bar{S_v})| \le 1\), implying that \(||M({S_u},\bar{S_v}) - \epsilon \mathbf{{1}}_{|S_u|} \delta v(\bar{S_v})^T||_F^2 \le mn (d+1)^2\).
Because \((u,v)\) corresponds to a maximal biclique, there must be at least one \(-d\) entry in each column of \(M({S_u},\bar{S_v})\). Let us analyze each column separately. For any \(i \in \bar{S_v}\), let us denote by \(n_i \ge 1\) the number of \(-d\) entries in the column \(M({S_u},i)\). We have
$$\begin{aligned} ||M({S_u},i) - \epsilon \mathbf{{1}}_{|S_u|} \delta v(i) ||_F^2&= n_i (-d- \epsilon \delta v(i))^2 + (|{S_u}|-n_i) (1-\epsilon \delta v(i))^2 \\&\ge n_i d^2 + (|{S_u}|-n_i) + 2 \epsilon \delta v(i) (n_i d - |{S_u}|+n_i) \\&= ||M({S_u},i)||_F^2 + 2 \epsilon \delta v(i) (n_i d +n_i - |{S_u}|) \\&\ge ||M({S_u},i)||_F^2 + 2 \epsilon \delta v(i). \end{aligned}$$
In fact, \(n_i d \ge d \ge \max (m,n) \ge |{S_u}|\) (it is then actually sufficient to take \(d > \max (m,n)-1\)). Finally, recalling that \(\delta v(\bar{S_v}) \ge 0\) and summing over the index \(i \in \bar{S_v}\), we obtain
$$\begin{aligned} e_3&\ge ||M({S_u},\bar{S_v}) - u(S_u) v(\bar{S_v})^T ||_F^2 + 2 \epsilon ||\delta v(\bar{S_v})||_1\\&\quad - 2 \sqrt{mn}(d+1) \epsilon ^2 ||\delta u({S_u}) \delta v(\bar{S_v})^T||_F. \end{aligned}$$
4. Submatrix \((\bar{S_u},{S_v})\). By symmetry, the same can be done as for the submatrix \(({S_u},\bar{S_v})\), and we have
$$\begin{aligned} e_4&= ||M(\bar{S_u},{S_v}) - x(\bar{S_u})y({S_v})^T ||_F^2 \\&\ge ||M(\bar{S_u},{S_v}) - u(\bar{S_u}) v({S_v})^T ||_F^2 + 2 \epsilon ||\delta u(\bar{S_u})||_1\\&\quad - 2 \sqrt{mn}(d+1) \epsilon ^2 ||\delta u(\bar{S_u}) \delta v({S_v})^T||_F. \end{aligned}$$
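The column-wise bound used for the submatrix \(({S_u},\bar{S_v})\) can be checked numerically. The sketch below is only a sanity check on random instances, not part of the proof: the helper name `column_gap` is hypothetical, and it takes \(d = |S_u|\), the smallest value allowed by \(d \ge \max (m,n) \ge |S_u|\).

```python
import numpy as np

def column_gap(size_u, n_neg, d, eps, delta):
    """Left-hand side minus right-hand side of the column bound.

    The column of M(S_u, i) has n_neg entries equal to -d and the rest equal to 1;
    delta plays the role of the nonnegative scalar delta_v(i).
    """
    col = np.ones(size_u)
    col[:n_neg] = -d
    lhs = np.sum((col - eps * delta) ** 2)
    rhs = np.sum(col ** 2) + 2.0 * eps * delta
    return lhs - rhs

rng = np.random.default_rng(0)
worst = min(
    column_gap(s, rng.integers(1, s + 1), d=s,
               eps=rng.uniform(0, 1), delta=rng.uniform(0, 1))
    for s in rng.integers(1, 8, size=1000)
)
print(worst)
```

The smallest gap over the sampled instances stays nonnegative up to rounding, in line with the inequality \(||M({S_u},i) - \epsilon \mathbf{1}_{|S_u|} \delta v(i)||_F^2 \ge ||M({S_u},i)||_F^2 + 2 \epsilon \delta v(i)\).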
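Combining the above results and noting \(C = 2\sqrt{mn}(d+1)\), we have
$$\begin{aligned} ||M(S_x,S_y)-x(S_x)y(S_y)^T||_F^2&= e_1 + e_2 + e_3 + e_4 \\&\ge ||M(S_x,S_y)-u(S_x)v(S_y)^T||_F^2 + 2 \epsilon \left( ||\delta u(\bar{S_u})||_1 + ||\delta v(\bar{S_v})||_1 \right) \\&\quad - C \epsilon ^2 \left( ||\delta u(\bar{S_u}) \delta v(\bar{S_v})^T||_F + ||\delta u({S_u}) \delta v(\bar{S_v})^T||_F + ||\delta u(\bar{S_u}) \delta v({S_v})^T||_F \right). \end{aligned}$$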
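Recalling that \(||x||_1 \ge ||x||_2\) for any \(x \in \mathbb{R }^n, ||xy^T||_F = ||x||_2||y||_2\) for any \(x \in \mathbb{R }^n\) and \(y \in \mathbb{R }^m\), and that \(||\delta u||_2 \le 1\) and \(||\delta v||_2 \le 1\), we have that for any \(0 < \epsilon < \frac{1}{C}\)
$$\begin{aligned} ||M(S_x,S_y)-x(S_x)y(S_y)^T||_F^2&\ge ||M(S_x,S_y)-u(S_x)v(S_y)^T||_F^2 + 2 \epsilon (1 - C \epsilon ) \left( ||\delta u(\bar{S_u})||_2 + ||\delta v(\bar{S_v})||_2 \right) \\&\ge ||M(S_x,S_y)-u(S_x)v(S_y)^T||_F^2. \end{aligned}$$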
Finally, for any \(d \ge \max (m,n), uv^T \in B(G), 0 < \epsilon < \frac{1}{2mn(d+1)^2}, x \in \mathcal{B }_+(u, \epsilon )\) and \(y \in \mathcal{B }_+(v,\epsilon )\), we have \(||M-uv^T||_F^2 \le ||M-xy^T||_F^2\).
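The statement can be probed numerically on a small instance. The following sketch is an illustration under assumptions, not a substitute for the proof: the toy graph, the helper names `objective` and `sample_nonneg_ball`, and the construction of \(M\) (1 on edges, \(-d\) elsewhere) are choices made here. It samples points of \(\mathcal{B }_+(u,\epsilon )\) and \(\mathcal{B }_+(v,\epsilon )\) around a maximal biclique and records the smallest change in the objective.

```python
import numpy as np

def objective(M, x, y):
    """Squared Frobenius error of the rank-one approximation x y^T."""
    return np.linalg.norm(M - np.outer(x, y), "fro") ** 2

def sample_nonneg_ball(center, eps, rng):
    """Random nonnegative point within Euclidean distance eps of a nonnegative center."""
    step = rng.standard_normal(center.size)
    step *= eps * rng.uniform() / np.linalg.norm(step)
    # Clipping at zero cannot increase the distance to a nonnegative center.
    return np.maximum(0.0, center + step)

A = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1]], dtype=float)
m, n = A.shape
d = max(m, n)
M = A - d * (1.0 - A)                  # assumed: 1 on edges, -d elsewhere
u = np.array([1.0, 1.0, 0.0])          # maximal biclique {0,1} x {0,1}
v = np.array([1.0, 1.0, 0.0])
eps = 0.9 / (2 * m * n * (d + 1) ** 2)

rng = np.random.default_rng(0)
f0 = objective(M, u, v)
min_increase = min(
    objective(M, sample_nonneg_ball(u, eps, rng), sample_nonneg_ball(v, eps, rng)) - f0
    for _ in range(2000)
)
print(f0, min_increase)
```

No sampled perturbation decreases the objective (up to rounding), which is what local minimality of \(uv^T\) requires.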
Appendix B: Proof of Theorem 6
Let \((u,v)\) be a nontrivial saddle point of \(\text{ R1N }_d(G)\) (hence \(uv^T \in \mathcal{S }_d(G)\)). Let us denote the (non-empty) support of \(u\) as \(K = \text{ supp }(u)\) and the (non-empty) support of \(v\) as \(L = \text{ supp }(v)\), and define \(u^{\prime } = u(K), v^{\prime } = v(L)\) and \(M^{\prime } = M(K,L)\) to be the subvectors and submatrix with indexes in \(K, L\) and \(K \times L\), respectively. Let us also define \(G^{\prime }\) as the bipartite graph whose biadjacency matrix is given by \(A(K,L)\).
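Observe that \((u^{\prime },v^{\prime })\) must be a saddle point of R1N(\(G^{\prime }\)) otherwise \((u,v)\) would not be a saddle point of \(\text{ R1N }_d(G)\). In fact, the objective functions of these two problems differ only by an additive constant: we have \(||M-uv^T||_F^2 = ||M^{\prime }-u^{\prime }v^{\prime T}||_F^2 + ||M||_F^2-||M^{\prime }||_F^2\). By stationarity of \((u,v)\), and since \(u^{\prime }\) and \(v^{\prime }\) are positive, Eq. (6) gives
$$\begin{aligned} u^{\prime } = \frac{M^{\prime }v^{\prime }}{||v^{\prime }||_2^2} \quad \text{ and } \quad v^{\prime } = \frac{M^{\prime T}u^{\prime }}{||u^{\prime }||_2^2}. \end{aligned}$$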
Therefore, \((u^{\prime }/||u^{\prime }||_2,v^{\prime }/||v^{\prime }||_2) > 0\) defines a pair of singular vectors of \(M^{\prime }\) associated with the singular value \(||u^{\prime }||_2||v^{\prime }||_2 > 0\).
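The link between stationarity and singular vectors can be checked on a small case. The sketch below assumes that, on the supports, stationarity reads \(u^{\prime } = M^{\prime }v^{\prime }/||v^{\prime }||_2^2\) and \(v^{\prime } = M^{\prime T}u^{\prime }/||u^{\prime }||_2^2\) (the form given for \(v\) in the Notes, with the maximum inactive because the entries are positive); the function name `check_singular_pair` and the all-ones test matrix are illustrative.

```python
import numpy as np

def check_singular_pair(M, u, v):
    """Return (is_stationary, is_singular_pair, sigma) for positive vectors u, v."""
    stationary = (np.allclose(u, M @ v / (v @ v))
                  and np.allclose(v, M.T @ u / (u @ u)))
    sigma = np.linalg.norm(u) * np.linalg.norm(v)
    un, vn = u / np.linalg.norm(u), v / np.linalg.norm(v)
    singular = (np.allclose(M @ vn, sigma * un)
                and np.allclose(M.T @ un, sigma * vn))
    return stationary, singular, sigma

M = np.ones((2, 3))                     # a block without any -d entry
stationary, singular, sigma = check_singular_pair(M, np.ones(2), np.ones(3))
print(stationary, singular, sigma)
```

Here the normalized pair is a singular pair with singular value \(||u^{\prime }||_2||v^{\prime }||_2 = \sqrt{6}\), which is also the largest singular value of the all-ones block.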
If \(M^{\prime }\) does not contain any \(-d\) entries, then \((u^{\prime },v^{\prime }) = (\mathbf{1 }_{|K|}, \mathbf{1 }_{|L|})\) is the unique pair of positive singular vectors (up to a constant factor). We then have that \(uv^T \in F(G)\). By Theorem 3, \(uv^T \in B(G) = \mathcal{L }_d(G)\) is then a local minimum since \(F(G) \cap \mathcal{S }_d(G) = B(G) = \mathcal{L }_d(G)\) for any \(d \ge \max (m,n)\), a contradiction.
Therefore \(M^{\prime }\) contains at least one \(-d\) entry. By Lemma 2, any pair of singular vectors of \(M^{\prime }\) associated with the largest singular value of \(M^{\prime }\) must contain at least one non-positive entry. Therefore, \((u^{\prime },v^{\prime })\) is a pair of positive singular vectors of \(M^{\prime }\) not associated with the largest singular value of \(M^{\prime }\), i.e., it is a saddle point of R1U(\(M^{\prime }\)).
An example of such a saddle point is given in Example 1.
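A small numerical illustration of this situation is sketched below. The \(2 \times 2\) matrix with \(d = 2\) is a hypothetical instance chosen here, not necessarily the one used in Example 1, and `positive_pairs` is an illustrative helper that lists the singular pairs that can be signed entrywise positive.

```python
import numpy as np

def positive_pairs(M, tol=1e-12):
    """Indices k whose singular pair (u_k, v_k) is entrywise positive up to a joint sign flip."""
    U, s, Vt = np.linalg.svd(M)
    found = []
    for k in range(len(s)):
        u, v = U[:, k], Vt[k]
        if u.sum() < 0:                 # (u, v) and (-u, -v) describe the same pair
            u, v = -u, -v
        if np.all(u > tol) and np.all(v > tol):
            found.append(k)
    return found, s

# One non-edge (entry -d with d = 2); all other entries are edges.
M_prime = np.array([[1.0, -2.0],
                    [1.0,  1.0]])
indices, sing_vals = positive_pairs(M_prime)
print(indices, sing_vals)
```

For this matrix the pair attached to the largest singular value has entries of both signs, while the pair attached to the second singular value is positive, which matches the configuration described in the proof.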
Cite this article
Gillis, N., Glineur, F. A continuous characterization of the maximum-edge biclique problem. J Glob Optim 58, 439–464 (2014). https://doi.org/10.1007/s10898-013-0053-2