
A continuous characterization of the maximum-edge biclique problem

Journal of Global Optimization

Abstract

The problem of finding large complete subgraphs in bipartite graphs (that is, bicliques) is a well-known combinatorial optimization problem referred to as the maximum-edge biclique problem (MBP), and has many applications, e.g., in web community discovery, biological data analysis and text mining. In this paper, we present a new continuous characterization for MBP. Given a bipartite graph \(G\), we are able to formulate a continuous optimization problem (namely, an approximate rank-one matrix factorization problem with nonnegativity constraints, R1N for short), and show that there is a one-to-one correspondence between (1) the maximum (i.e., the largest) bicliques of \(G\) and the global minima of R1N, and (2) the maximal bicliques of \(G\) (i.e., bicliques not contained in any larger biclique) and the local minima of R1N. We also show that any stationary point of R1N must be close to a biclique of \(G\). This allows us to design a new type of biclique-finding algorithm based on the application of a block-coordinate descent scheme to R1N. We show that this algorithm, whose algorithmic complexity per iteration is proportional to the number of edges in the graph, is guaranteed to converge to a biclique and that it performs competitively with existing methods on random graphs and text mining datasets. Finally, we show how R1N is closely related to the Motzkin–Straus formalism for cliques.
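The block-coordinate descent idea can be illustrated with a minimal sketch. It assumes, as the appendices suggest, that the objective is \(||M-uv^T||_F^2\) where \(M\) has entries \(1\) on edges and \(-d\) on non-edges, and it uses the closed-form update \(v = \max (0, M^Tu/||u||_2^2)\) quoted in the notes. The function names (`build_M`, `best_u`, `best_v`, `r1n_bcd`), the toy graph and the starting point are hypothetical choices made here, not taken from the paper. The sketch uses dense loops, so it does not reproduce the per-iteration cost proportional to the number of edges, and it omits the safety procedure on \(d\) mentioned in the notes.

```python
def build_M(A, d):
    """Return M with 1 on edges of the biadjacency matrix A and -d elsewhere."""
    return [[1.0 if a else -float(d) for a in row] for row in A]


def objective(M, u, v):
    """Squared Frobenius error ||M - u v^T||_F^2."""
    return sum((M[i][j] - u[i] * v[j]) ** 2
               for i in range(len(u)) for j in range(len(v)))


def best_v(M, u):
    """Closed-form block update v = max(0, M^T u / ||u||_2^2)."""
    norm2 = sum(x * x for x in u)
    return [max(0.0, sum(M[i][j] * u[i] for i in range(len(u))) / norm2)
            for j in range(len(M[0]))]


def best_u(M, v):
    """Symmetric block update u = max(0, M v / ||v||_2^2)."""
    norm2 = sum(y * y for y in v)
    return [max(0.0, sum(M[i][j] * v[j] for j in range(len(v))) / norm2)
            for i in range(len(M))]


def r1n_bcd(A, d, u0, iters=50):
    """Alternate the two block updates, starting from a nonzero u0."""
    M = build_M(A, d)
    u = list(u0)
    v = best_v(M, u)
    for _ in range(iters):
        if not any(v):  # a factor collapsed to zero: stop (no safety procedure here)
            break
        u = best_u(M, v)
        if not any(u):
            break
        v = best_v(M, u)
    return u, v


# Hypothetical 3x3 toy graph; rows {0, 1} and columns {0, 1} form a maximal biclique.
A = [[1, 1, 1],
     [1, 1, 0],
     [0, 0, 1]]
u, v = r1n_bcd(A, d=3, u0=[1.0, 0.8, 0.1])
rows = [i for i, x in enumerate(u) if x > 0]
cols = [j for j, y in enumerate(v) if y > 0]
print("rows:", rows, "cols:", cols)
```

The supports of the returned `u` and `v` are read as the two sides of the biclique. With a different starting point the iteration may reach another biclique or collapse to zero, which is the situation the safety procedure in the notes is meant to handle.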

Notes

  1. The first-order stationarity condition of \(\text{ R1N }_d(G)\) for variables \(v\) is given by \(v = \max \left( 0,{M^Tu}/{||u||_2^2}\right) \), see Sect. 3.3. Therefore, local and global minimizers of \(\text{ R1N }_d(G)\) must satisfy this condition, hence they exactly correspond to the local and global minimizers of the problem in the new variables \((x,y)\).

  2. By Wedin’s theorem (cf. matrix perturbation theory [17]), singular subspaces of \(M\) associated with a positive singular value depend continuously on \(d\).

  3. In practice, we used a safety procedure which reduces the value of \(d\) whenever \(u\) or \(v\) is set to zero and reinitializes \(u\) and \(v\) to their previous values.

  4. Additional tweaking of parameters MIPFocus, Heuristics, PreQLinearize, MIQCPMethod and RINS did not lead to better results.

References

  1. Alexe, G., Alexe, S., Crama, Y., Foldes, S., Hammer, P., Simeone, B.: Consensus algorithms for the generation of all maximal bicliques. Discret. Appl. Math. 145(1), 11–21 (2004)

  2. Bomze, I.: Evolution towards the maximum clique. J. Glob. Opt. 10, 143–164 (1997)

  3. Ding, C., Li, T., Jordan, M.: Nonnegative matrix factorization for combinatorial optimization: spectral clustering, graph matching, and clique finding. In: IEEE International Conference on Data Mining, pp. 183–192 (2008)

  4. Ding, C., Zhang, Y., Li, T., Holbrook, S.: Biclustering protein complex interactions with a biclique finding algorithm. In: IEEE International Conference on Data Mining, pp. 178–187 (2006)

  5. Dolan, E., Moré, J.: Benchmarking optimization software with performance profiles. Math. Prog. Ser. A 91, 201–213 (2002)

  6. Gibbons, L., Hearn, D., Pardalos, P., Ramana, M.: Continuous characterizations of the maximum clique problem. Math. Oper. Res. 22(3), 754–768 (1997)

  7. Gillis, N.: Nonnegative Matrix Factorization: Complexity, Algorithms and Applications. Ph.D. Thesis, Université catholique de Louvain (2011)

  8. Gillis, N., Glineur, F.: Nonnegative Factorization and The Maximum Edge Biclique Problem (2008). CORE Discuss. pap. 2008/64

  9. Golub, G., Van Loan, C.: Matrix Computations, 3rd edn. The Johns Hopkins University Press, Baltimore (1996)

  10. Grippo, L., Sciandrone, M.: On the convergence of the block nonlinear Gauss–Seidel method under convex constraints. Oper. Res. Lett. 26, 127–136 (2000)

  11. Gurobi Optimization, Inc.: Gurobi Optimizer Reference Manual (2012). http://www.gurobi.com

  12. Lehmann, S., Schwartz, M., Hansen, L.: Biclique communities. Phys. Rev. E 78(1), 016108 (2008)

  13. Liu, G., Sim, K., Li, J.: Efficient Mining of Large Maximal Bicliques, Lect. Notes in Comput. Sci. pp. 437–448. Springer, Berlin (2006)

  14. Motzkin, T., Straus, E.: Maxima for graphs and a new proof of a theorem of Turán. Can. J. Math. 17, 533–540 (1965)

  15. Peeters, R.: The maximum edge biclique problem is NP-complete. Discret. Appl. Math. 131(3), 651–654 (2003)

  16. Prelic, A., Bleuler, S., Zimmermann, P., Wille, A., Buhlmann, P., Gruissem, W., Hennig, L., Thiele, L., Zitzler, E.: A systematic comparison and evaluation of biclustering methods for gene expression data. Bioinformatics 22(9), 1122–1129 (2006)

  17. Stewart, G., Sun, J.G.: Matrix Perturbation Theory. Academic Press, San Diego (1990)

  18. Zhong, S., Ghosh, J.: Generative model-based document clustering: a comparative study. Knowl. Inf. Syst. 8(3), 374–384 (2005)

Author information

Correspondence to Nicolas Gillis.

Additional information

This paper presents research results of the Belgian Network DYSCO (Dynamical Systems, Control, and Optimization), funded by the Interuniversity Attraction Poles Programme initiated by the Belgian Science Policy Office. The first author is a postdoctoral researcher with the Fonds de la Recherche Scientifique-FNRS (F.R.S.-FNRS).

Appendices

Appendix A: Proof of Theorem 1

Let us show that \(B(G) \subseteq \mathcal{L }_d(G)\) for any \(d \ge \max (m,n)\). Let \(uv^T \in B(G)\), with \(u\) and \(v\) binary without loss of generality. The binary rank-one matrix \(uv^T\) belongs to \(\mathcal{L }_d(G)\) if and only if there exists \(\epsilon > 0\) such that for all \(x \in \mathcal{B }_+(u,\epsilon )\) and \(y \in \mathcal{B }_+(v,\epsilon )\), we have \(||M-uv^T||_F^2 \le ||M-xy^T||_F^2\).

Now let \(x \in \mathcal{B }_+(u,\epsilon )\) and \(y \in \mathcal{B }_+(v,\epsilon )\), and let us denote by \(S_u, S_v, S_x\) and \(S_y\) the supports of \(u, v, x\) and \(y\), respectively. For \(\epsilon < 1\), since \(u\) and \(v\) are binary, we have \(S_u \subseteq S_x\) and \(S_v \subseteq S_y\) (i.e., \(u_i = 1 \Rightarrow x_i > 0\) and \(v_j = 1 \Rightarrow y_j > 0\)). This implies that for \(\epsilon < 1, ||M-uv^T||_F^2 \le ||M-xy^T||_F^2\) if and only if

$$\begin{aligned} ||M(S_x,S_y)-u(S_x)v(S_y)^T||_F^2 \le ||M(S_x,S_y)-x(S_x)y(S_y)^T||_F^2. \end{aligned}$$

Let us denote \(\bar{S_u} = S_x \backslash S_u\) and \(\bar{S_v} = S_y \backslash S_v\). Since \(x \in \mathcal{B }_+(u,\epsilon )\), there exists \(\delta u\) such that \(x = u + \epsilon \delta u\) with \(||\delta u||_2 \le 1\) and \(\delta u(\bar{S_u}) \ge 0\) since \(u(\bar{S_u}) = 0\); symmetrically there exists \(\delta v\) such that \(y = v + \epsilon \delta v\) with \(||\delta v||_2 \le 1\) and \(\delta v(\bar{S_v}) \ge 0\).

Let us analyze the four submatrices of \(M(S_x,S_y)\) corresponding to the decompositions \(S_x = S_u \cup \bar{S_u}\) and \(S_y = S_v \cup \bar{S_v}\).

  1. Submatrix \((S_u,S_v)\). Since \(M(S_u,S_v) = \mathbf{1}_{|S_u|\times |S_v|}, u({S_u}) = \mathbf{1}_{|S_u|}\) and \(v({S_v}) = \mathbf{1}_{|S_v|}\),

    $$\begin{aligned} e_1 = ||M(S_u,S_v)-x(S_u)y(S_v)^T||_F^2 \ge ||M(S_u,S_v)-u(S_u)v(S_v)^T||_F^2 = 0. \end{aligned}$$
  2. Submatrix \((\bar{S_u},\bar{S_v})\). Since \(u(\bar{S_u}) = 0, v(\bar{S_v}) = 0\) and \(||M(\bar{S_u},\bar{S_v})||_F^2 \le |\bar{S_u}||\bar{S_v}|d^2 \le mnd^2\) for \(d \ge 1\),

    $$\begin{aligned} e_2&= ||M(\bar{S_u},\bar{S_v})-x(\bar{S_u})y(\bar{S_v})^T||_F^2 = ||M(\bar{S_u},\bar{S_v}) - \epsilon ^2 \delta u(\bar{S_u}) \delta v(\bar{S_v})^T||_F^2 \\&\ge ||M(\bar{S_u},\bar{S_v})||_F^2 - 2\sqrt{mn}\, d\, \epsilon ^2 ||\delta u(\bar{S_u}) \delta v(\bar{S_v})^T||_F. \end{aligned}$$

    Indeed, recall that \(||A-B||_F^2 = ||A||_F^2 - 2 \sum _{ij} A_{ij} B_{ij} + ||B||_F^2 \ge ||A||_F^2 - 2 ||A||_F ||B||_F\), where the last step follows from the Cauchy–Schwarz inequality.

  3. Submatrix \(({S_u},\bar{S_v})\). Since \(u({S_u}) = \mathbf{1}_{|S_u|}, v(\bar{S_v}) = \mathbf{0}_{|\bar{S_v}|}, d \ge 1\) and \(\epsilon < 1\),

    $$\begin{aligned} e_3&= ||M({S_u},\bar{S_v})-x({S_u})y(\bar{S_v})^T||_F^2 \\&= ||M({S_u},\bar{S_v}) - \epsilon (\mathbf{{1}}_{|S_u|}+\epsilon \delta u({S_u})) \delta v(\bar{S_v})^T||_F^2 \\&= ||M({S_u},\bar{S_v}) - \epsilon \mathbf{{1}}_{|S_u|} \delta v(\bar{S_v})^T - \epsilon ^2 \delta u({S_u}) \delta v(\bar{S_v})^T||_F^2 \\&\ge ||M({S_u},\bar{S_v}) - \epsilon \mathbf{{1}}_{|S_u|} \delta v(\bar{S_v})^T ||_F^2 - 2\sqrt{mn}(d+1) \epsilon ^2 ||\delta u({S_u}) \delta v(\bar{S_v})^T||_F. \end{aligned}$$

    In fact, one can check that \(|M({S_u},\bar{S_v}) - \epsilon \mathbf{{1}}_{|S_u|} \delta v(\bar{S_v})^T| \le d+1\) for \(\epsilon < 1\) since \(|\delta v(\bar{S_v})| \le 1\) implying that \(||M({S_u},\bar{S_v}) - \epsilon \mathbf{{1}}_{|S_u|} \delta v(\bar{S_v})^T||_F^2 \le mn (d+1)^2\).

    Because \((u,v)\) corresponds to a maximal biclique, there must be at least one \(-d\) entry in each column of \(M({S_u},\bar{S_v})\). Let us analyze each column separately. For any \(i \in \bar{S_v}\), let us denote by \(n_i \ge 1\) the number of \(-d\) entries in the column \(M({S_u},i)\). We have

    $$\begin{aligned} ||M({S_u},i) - \epsilon \mathbf{{1}}_{|S_u|} \delta v(i) ||_F^2&= n_i (-d- \epsilon \delta v(i))^2 + (|{S_u}|-n_i) (1-\epsilon \delta v(i))^2 \\&\ge n_i d^2 + (|{S_u}|-n_i) + 2 \epsilon \delta v(i) (n_i d - |{S_u}|+n_i) \\&= ||M({S_u},i)||_F^2 + 2 \epsilon \delta v(i) (n_i d +n_i - |{S_u}|) \\&\ge ||M({S_u},i)||_F^2 + 2 \epsilon \delta v(i). \end{aligned}$$

    In fact, \(n_i d \ge d \ge \max (m,n) \ge |{S_u}|\) (it is then actually sufficient to take \(d > \max (m,n)-1\)). Finally, recalling that \(\delta v(\bar{S_v}) \ge 0\) and summing on index \(i \in \bar{S_v}\), we obtain

    $$\begin{aligned} e_3&\ge ||M({S_u},\bar{S_v}) - u(S_u) v(\bar{S_v})^T ||_F^2 + 2 \epsilon ||\delta v(\bar{S_v})||_1\\&- 2 \sqrt{mn}(d+1) \epsilon ^2 ||\delta u({S_u}) \delta v(\bar{S_v})^T||_F. \end{aligned}$$
  4. Submatrix \((\bar{S_u},{S_v})\). By symmetry, the same argument applies as for the submatrix \(({S_u},\bar{S_v})\), and we have

    $$\begin{aligned} e_4&= ||M(\bar{S_u},{S_v}) - x(\bar{S_u})y({S_v})^T ||_F^2 \\&\ge ||M(\bar{S_u},{S_v}) - u(\bar{S_u}) v({S_v})^T ||_F^2 + 2 \epsilon ||\delta u(\bar{S_u})||_1\\&\quad - 2 \sqrt{mn}(d+1) \epsilon ^2 ||\delta u(\bar{S_u}) \delta v({S_v})^T||_F. \end{aligned}$$
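As a concrete instance of the column bound used for the submatrix \(({S_u},\bar{S_v})\), take, purely for illustration, \(|S_u| = 2\), \(n_i = 1\) and \(d = 3\), and write \(t = \epsilon \delta v(i) \ge 0\) (the values and the shorthand \(t\) are chosen here and do not come from the paper). The column then contains one \(-3\) entry and one \(1\) entry, and

$$\begin{aligned} ||M({S_u},i) - t \mathbf{1}_{2}||_F^2&= (-3-t)^2 + (1-t)^2 = 10 + 4t + 2t^2 \\&\ge 10 + 2t = ||M({S_u},i)||_F^2 + 2t. \end{aligned}$$

The linear coefficient \(4 = 2(n_i d + n_i - |S_u|)\) is at least \(2\) precisely because \(n_i d \ge |S_u|\) and \(n_i \ge 1\).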

Combining the above results and noting \(C = 2\sqrt{mn}(d+1)\), we have

$$\begin{aligned} e_{T}&= e_1+e_2+e_3+e_4 \\&= ||M({S_x},{S_y}) - x({S_x})y({S_y})^T ||_F^2 \\&\ge ||M({S_x},{S_y}) - u({S_x})v({S_y})^T ||_F^2 + 2 \epsilon ||\delta u(\bar{S_u})||_1 + 2 \epsilon ||\delta v(\bar{S_v})||_1 \\&\quad - C \epsilon ^2 ||\delta u(\bar{S_u}) \delta v(\bar{S_v})^T||_F - C \epsilon ^2 (||\delta u(\bar{S_u}) \delta v({S_v})^T||_F+||\delta u({S_u}) \delta v(\bar{S_v})^T||_F). \end{aligned}$$

Recalling that \(||x||_1 \ge ||x||_2\) for any \(x \in \mathbb{R }^n, ||xy^T||_F = ||x||_2||y||_2\) for any \(x \in \mathbb{R }^n\) and \(y \in \mathbb{R }^m\), and that \(||\delta u||_2 \le 1\) and \(||\delta v||_2 \le 1\), we have that for any \(0 < \epsilon < \frac{1}{C}\)

$$\begin{aligned} e_{T}&\ge ||M({S_x},{S_y}) - u({S_x})v({S_y})^T ||_F^2 \\&\quad + \epsilon ||\delta u(\bar{S_u})||_2 \left( 2 - C \epsilon ||\delta v(\bar{S_v})||_2 - C \epsilon ||\delta v({S_v})||_2 \right) \\&\quad + \epsilon ||\delta v(\bar{S_v})||_2 \left( 2 - C\epsilon ||\delta u(\bar{S_u})||_2 - C \epsilon ||\delta u({S_u})||_2 \right) \\&\ge ||M({S_x},{S_y}) - u({S_x})v({S_y})^T ||_F^2 + 2 \epsilon (1-C \epsilon ) (||\delta u(\bar{S_u})||_2+||\delta v(\bar{S_v})||_2) \\&\ge ||M({S_x},{S_y}) - u({S_x})v({S_y})^T ||_F^2. \end{aligned}$$

Finally, for any \(d \ge \max (m,n), uv^T \in B(G), 0 < \epsilon < \frac{1}{2mn(d+1)^2}, x \in \mathcal{B }_+(u, \epsilon )\) and \(y \in \mathcal{B }_+(v,\epsilon )\), we have \(||M-uv^T||_F^2 \le ||M-xy^T||_F^2\).
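The statement can be sanity-checked numerically on a small instance. The sketch below is only an illustration, not part of the proof: the 3-by-3 graph, the sampling scheme, the sample count and the helper names (`objective`, `sample_in_ball_plus`) are assumptions made here. It draws random points of \(\mathcal{B }_+(u,\epsilon )\) and \(\mathcal{B }_+(v,\epsilon )\) around a maximal biclique, with \(\epsilon \) below the bound \(\frac{1}{2mn(d+1)^2}\), and counts how often the objective falls below its value at \((u,v)\).

```python
import math
import random


def objective(M, x, y):
    """Squared Frobenius error ||M - x y^T||_F^2."""
    return sum((M[i][j] - x[i] * y[j]) ** 2
               for i in range(len(x)) for j in range(len(y)))


def sample_in_ball_plus(w, eps, rng):
    """Random point z >= 0 with ||z - w||_2 <= eps, for a binary vector w and eps < 1."""
    delta = [rng.uniform(-1.0, 1.0) for _ in w]
    # Entries where w is zero may only move upward, so that z stays nonnegative.
    delta = [abs(t) if wi == 0 else t for t, wi in zip(delta, w)]
    norm = math.sqrt(sum(t * t for t in delta))
    scale = rng.random() / norm if norm > 0 else 0.0  # makes ||scale * delta||_2 <= 1
    return [wi + eps * scale * t for wi, t in zip(w, delta)]


# Hypothetical toy graph: rows {0, 1} and columns {0, 1} form a maximal biclique.
A = [[1, 1, 1],
     [1, 1, 0],
     [0, 0, 1]]
m, n = len(A), len(A[0])
d = max(m, n)
M = [[1.0 if a else -float(d) for a in row] for row in A]
u = [1.0, 1.0, 0.0]
v = [1.0, 1.0, 0.0]

eps = 0.9 / (2 * m * n * (d + 1) ** 2)  # strictly below the bound in the text
rng = random.Random(0)
base = objective(M, u, v)

violations = 0
for _ in range(2000):
    x = sample_in_ball_plus(u, eps, rng)
    y = sample_in_ball_plus(v, eps, rng)
    if objective(M, x, y) < base - 1e-12:  # small tolerance for rounding
        violations += 1

print("objective at (u, v):", base, "violations:", violations)
```

A count of zero is what the theorem predicts; random sampling cannot prove local minimality, it can only fail to find a counterexample.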

Appendix B: Proof of Theorem 6

Let \((u,v)\) be a nontrivial saddle point of \(\text{ R1N }_d(G)\) (hence \(uv^T \in \mathcal{S }_d(G)\)). Let us denote the (non-empty) support of \(u\) as \(K = \text{ supp }(u)\) and the (non-empty) support of \(v\) as \(L = \text{ supp }(v)\), and define \(u^{\prime } = u(K), v^{\prime } = v(L)\) and \(M^{\prime } = M(K,L)\) to be the subvectors and submatrix with indexes in \(K, L\) and \(K \times L\), respectively. Let us also define \(G^{\prime }\) as the bipartite graph whose biadjacency matrix is given by \(A(K,L)\).

Observe that \((u^{\prime },v^{\prime })\) must be a saddle point of \(\text{ R1N }_d(G^{\prime })\), otherwise \((u,v)\) would not be a saddle point of \(\text{ R1N }_d(G)\). Indeed, the objective functions of these two problems differ only by an additive constant: since \(u\) and \(v\) vanish outside \(K\) and \(L\), we have \(||M-uv^T||_F^2 = ||M^{\prime }-u^{\prime }v^{\prime T}||_F^2 + ||M||_F^2-||M^{\prime }||_F^2\). By stationarity of \((u,v)\), Eq. (6) gives

$$\begin{aligned} u^{\prime } = \frac{M^{\prime }v^{\prime }}{||v^{\prime }||_2^2} \quad \text{ and } \quad v^{\prime } = \frac{M^{\prime T}u^{\prime }}{||u^{\prime }||_2^2}. \end{aligned}$$

Therefore, \((u^{\prime }/||u^{\prime }||_2,v^{\prime }/||v^{\prime }||_2) > 0\) defines a pair of singular vectors of \(M^{\prime }\) associated with the singular value \(||u^{\prime }||_2||v^{\prime }||_2 > 0\).
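To spell out this step, write \(\hat{u} = u^{\prime }/||u^{\prime }||_2\) and \(\hat{v} = v^{\prime }/||v^{\prime }||_2\) (shorthand introduced here only for convenience). The two stationarity equations read \(M^{\prime }v^{\prime } = ||v^{\prime }||_2^2 u^{\prime }\) and \(M^{\prime T}u^{\prime } = ||u^{\prime }||_2^2 v^{\prime }\), so that

$$\begin{aligned} M^{\prime }\hat{v}&= \frac{M^{\prime }v^{\prime }}{||v^{\prime }||_2} = ||v^{\prime }||_2 u^{\prime } = \left( ||u^{\prime }||_2||v^{\prime }||_2\right) \hat{u}, \\ M^{\prime T}\hat{u}&= \frac{M^{\prime T}u^{\prime }}{||u^{\prime }||_2} = ||u^{\prime }||_2 v^{\prime } = \left( ||u^{\prime }||_2||v^{\prime }||_2\right) \hat{v}, \end{aligned}$$

which is the defining relation of a pair of unit-norm singular vectors with singular value \(||u^{\prime }||_2||v^{\prime }||_2\).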

If \(M^{\prime }\) does not contain any \(-d\) entries, then \((u^{\prime },v^{\prime }) = (\mathbf{1 }_{|K|}, \mathbf{1 }_{|L|})\) is the unique pair of positive singular vectors (up to a constant factor). We then have that \(uv^T \in F(G)\). By Theorem 3, \(uv^T \in B(G) = \mathcal{L }_d(G)\) is then a local minimum since \(F(G) \cap \mathcal{S }_d(G) = B(G) = \mathcal{L }_d(G)\) for any \(d \ge \max (m,n)\), a contradiction.

Therefore \(M^{\prime }\) contains at least one \(-d\) entry. By Lemma 2, any pair of singular vectors of \(M^{\prime }\) associated with the largest singular value of \(M^{\prime }\) must contain at least one non-positive entry. Therefore, \((u^{\prime },v^{\prime })\) is a pair of positive singular vectors of \(M^{\prime }\) not associated with the largest singular value of \(M^{\prime }\), i.e., it is a saddle point of R1U(\(M^{\prime }\)).

An example of such a saddle point is given in Example 1.

About this article

Cite this article

Gillis, N., Glineur, F. A continuous characterization of the maximum-edge biclique problem. J Glob Optim 58, 439–464 (2014). https://doi.org/10.1007/s10898-013-0053-2

  • DOI: https://doi.org/10.1007/s10898-013-0053-2
