A continuous characterization of the maximum-edge biclique problem

Gillis, Nicolas; Glineur, François

doi:10.1007/s10898-013-0053-2


Published: 20 March 2013

Volume 58, pages 439–464, (2014)

Journal of Global Optimization

Nicolas Gillis¹ & François Glineur¹,²


Abstract

The problem of finding large complete subgraphs in bipartite graphs (that is, bicliques) is a well-known combinatorial optimization problem referred to as the maximum-edge biclique problem (MBP), and has many applications, e.g., in web community discovery, biological data analysis and text mining. In this paper, we present a new continuous characterization for MBP. Given a bipartite graph $G$, we are able to formulate a continuous optimization problem (namely, an approximate rank-one matrix factorization problem with nonnegativity constraints, R1N for short), and show that there is a one-to-one correspondence between (1) the maximum (i.e., the largest) bicliques of $G$ and the global minima of R1N, and (2) the maximal bicliques of $G$ (i.e., bicliques not contained in any larger biclique) and the local minima of R1N. We also show that any stationary point of R1N must be close to a biclique of $G$. This allows us to design a new type of biclique finding algorithm based on the application of a block-coordinate descent scheme to R1N. We show that this algorithm, whose computational cost per iteration is proportional to the number of edges in the graph, is guaranteed to converge to a biclique and that it performs competitively with existing methods on random graphs and text mining datasets. Finally, we show how R1N is closely related to the Motzkin–Strauss formalism for cliques.
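To make the R1N objective concrete, the following minimal sketch builds a matrix $M$ from a toy biadjacency matrix and evaluates $||M-uv^T||_F^2$. It assumes, based on the appendices (which refer to entries of $M$ equal to $1$ and to $-d$), that $M$ is the biadjacency matrix with its zeros replaced by $-d$. The helper names `build_M` and `r1n_objective` and the toy graph are hypothetical, introduced only for illustration.

```python
import numpy as np

def build_M(A, d):
    """Assumed construction: keep the edges as 1 and replace non-edges by -d."""
    A = np.asarray(A)
    return np.where(A > 0, 1.0, -float(d))

def r1n_objective(M, u, v):
    """Squared Frobenius error ||M - u v^T||_F^2 of a rank-one approximation."""
    return float(np.sum((M - np.outer(u, v)) ** 2))

# Toy bipartite graph with 3 left and 3 right vertices (illustrative only).
A = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 1, 1]])
d = max(A.shape)          # d >= max(m, n), as required in Appendix A
M = build_M(A, d)

# Binary indicator vectors of the biclique {0, 1} x {0, 1} (4 edges).
u = np.array([1.0, 1.0, 0.0])
v = np.array([1.0, 1.0, 0.0])
cost_biclique = r1n_objective(M, u, v)

# Adding left vertex 2 covers the non-edge (2, 0), so this is not a biclique.
u_bad = np.array([1.0, 1.0, 1.0])
cost_non_biclique = r1n_objective(M, u_bad, v)

total = float(np.sum(M ** 2))  # ||M||_F^2
```

For binary indicator vectors of a biclique, every covered entry of $M$ equals $1$ and is fitted exactly, so the cost is $||M||_F^2$ minus the number of edges in the biclique; covering a $-d$ entry is penalized by $(d+1)^2$ instead of $d^2$.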


Notes

The first-order stationarity condition of $\text{ R1N }_d(G)$ for variables $v$ is given by $v = \max \left( 0,{M^Tu}/{||u||_2^2}\right) $, see Sect. 3.3. Therefore, local and global minimizers of $\text{ R1N }_d(G)$ must satisfy this condition, hence they exactly correspond to the local and global minimizers of the problem in the new variables $(x,y)$.
By Wedin’s theorem (cf. matrix perturbation theory [17]), singular subspaces of $M$ associated with a positive singular value depend continuously on $d$.
In practice, we used a safety procedure which reduces the value of $d$ whenever $u$ or $v$ is set to zero and reinitializes $u$ and $v$ to their previous value.
Additional tweaking of parameters MIPFocus, Heuristics, PreQLinearize, MIQCPMethod and RINS did not lead to better results.
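The stationarity condition in the first note suggests alternating closed-form updates of $v$ and $u$. The sketch below is a minimal illustration of such a block-coordinate scheme, assuming the symmetric update $u = \max(0, Mv/||v||_2^2)$ for the other block and the same toy matrix $M$ (zeros replaced by $-d$) as an assumption. The function name `r1n_alternating` is hypothetical, and the safety procedure that reduces $d$ is not implemented; the loop simply stops if a factor collapses to zero.

```python
import numpy as np

def r1n_alternating(M, u0, max_iter=100):
    """Alternate v = max(0, M^T u / ||u||^2) and u = max(0, M v / ||v||^2).

    Sketch only: assumes u0 is nonzero and stops (keeping the previous
    iterate) if an update would set a factor to zero.
    """
    u = np.asarray(u0, dtype=float).copy()
    v = np.zeros(M.shape[1])
    for _ in range(max_iter):
        v_new = np.maximum(0.0, M.T @ u / (u @ u))
        if not v_new.any():
            break  # v collapsed; the d-reduction safeguard is omitted here
        u_new = np.maximum(0.0, M @ v_new / (v_new @ v_new))
        if not u_new.any():
            break  # u collapsed; same remark
        converged = np.allclose(u_new, u) and np.allclose(v_new, v)
        u, v = u_new, v_new
        if converged:
            break
    return u, v

# Toy example (illustrative): biadjacency matrix A and M with non-edges set to -d.
A = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 1, 1]])
d = max(A.shape)
M = np.where(A > 0, 1.0, -float(d))

u1, v1 = r1n_alternating(M, np.ones(3))
u2, v2 = r1n_alternating(M, np.array([1.0, 1.0, 0.0]))
```

On this toy graph the two initializations reach different bicliques (three edges and four edges respectively), which illustrates that the outcome of such a local scheme depends on the starting point.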

References

1. Alexe, G., Alexe, S., Crama, Y., Foldes, S., Hammer, P., Simeone, B.: Consensus algorithms for the generation of all maximal bicliques. Discret. Appl. Math. 145(1), 11–21 (2004)
2. Bomze, I.: Evolution towards the maximum clique. J. Glob. Opt. 10, 143–164 (1997)
3. Ding, C., Li, T., Jordan, M.: Nonnegative matrix factorization for combinatorial optimization: spectral clustering, graph matching, and clique finding. In: IEEE International Conference on Data Mining, pp. 183–192 (2008)
4. Ding, C., Zhang, Y., Li, T., Holbrook, S.: Biclustering protein complex interactions with a biclique finding algorithm. In: IEEE International Conference on Data Mining, pp. 178–187 (2006)
5. Dolan, E., Moré, J.: Benchmarking optimization software with performance profiles. Math. Prog. Ser. A 91, 201–213 (2002)
6. Gibbons, L., Hearn, D., Pardalos, P., Ramana, M.: Continuous characterizations of the maximum clique problem. Math. Oper. Res. 22(3), 754–768 (1997)
7. Gillis, N.: Nonnegative Matrix Factorization: Complexity, Algorithms and Applications. Ph.D. Thesis, Université catholique de Louvain (2011)
8. Gillis, N., Glineur, F.: Nonnegative Factorization and The Maximum Edge Biclique Problem (2008). CORE Discuss. pap. 2008/64
9. Golub, G., Van Loan, C.: Matrix Computations, 3rd edn. The Johns Hopkins University Press, Baltimore (1996)
10. Grippo, L., Sciandrone, M.: On the convergence of the block nonlinear Gauss–Seidel method under convex constraints. Oper. Res. Lett. 26, 127–136 (2000)
11. Gurobi Optimization, I.: Gurobi Optimizer Reference Manual (2012). http://www.gurobi.com
12. Lehmann, S., Schwartz, M., Hansen, L.: Biclique communities. Phys. Rev. E 78(1), 016108 (2008)
13. Liu, G., Sim, K., Li, J.: Efficient Mining of Large Maximal Bicliques, Lect. Notes in Comput. Sci. pp. 437–448. Springer, Berlin (2006)
14. Motzkin, T., Strauss, E.: Maxima for graphs and a new proof of a theorem of Turan. Can. J. Math. 17, 533–540 (1965)
15. Peeters, R.: The maximum edge biclique problem is NP-complete. Discret. Appl. Math. 131(3), 651–654 (2003)
16. Prelic, A., Bleuler, S., Zimmermann, P., Wille, A., Buhlmann, P., Gruissem, W., Hennig, L., Thiele, L., Zitzler, E.: A systematic comparison and evaluation of biclustering methods for gene expression data. Bioinformatics 22(9), 1122–1129 (2006)
17. Stewart, G., Sun, J.G.: Matrix Perturbation Theory. Academic Press, San Diego (1990)
18. Zhong, S., Ghosh, J.: Generative model-based document clustering: a comparative study. Knowl. Inf. Syst. 8(3), 374–384 (2005)


Author information

Authors and Affiliations

ICTEAM Institute, Université catholique de Louvain, 1348 Louvain-la-Neuve, Belgium
Nicolas Gillis & François Glineur
Center for Operations Research and Econometrics, Université catholique de Louvain, Voie du Roman Pays, 34, 1348 Louvain-la-Neuve, Belgium
François Glineur

Corresponding author

Correspondence to Nicolas Gillis.

Additional information

This paper presents research results of the Belgian Network DYSCO (Dynamical Systems, Control, and Optimization), funded by the Interuniversity Attraction Poles Programme initiated by the Belgian Science Policy Office. The first author is a postdoctoral researcher with the Fonds de la Recherche Scientifique-FNRS (F.R.S.-FNRS).

Appendices

Appendix A: Proof of Theorem 1

Let us show that $B(G) \subseteq \mathcal{L }_d(G)$ for any $d \ge \max (m,n)$. Let $uv^T \in B(G)$, with $u$ and $v$ binary without loss of generality. The binary rank-one matrix $uv^T$ belongs to $\mathcal{L }_d(G)$ if and only if there exists $\epsilon > 0$ such that for all $x \in \mathcal{B }_+(u,\epsilon )$ and $y \in \mathcal{B }_+(v,\epsilon )$, we have $||M-uv^T||_F^2 \le ||M-xy^T||_F^2$.

Now let $x \in \mathcal{B }_+(u,\epsilon )$ and $y \in \mathcal{B }_+(v,\epsilon )$, and let us denote by $S_u, S_v, S_x$ and $S_y$ the supports of $u, v, x$ and $y$, respectively. For $\epsilon < 1$, since $u$ and $v$ are binary, we have $S_u \subseteq S_x$ and $S_v \subseteq S_y$ (i.e., $u_i = 1 \Rightarrow x_i > 0$ and $v_j = 1 \Rightarrow y_j > 0$). This implies that for $\epsilon < 1, ||M-uv^T||_F^2 \le ||M-xy^T||_F^2$ if and only if

$$\begin{aligned} ||M(S_x,S_y)-u(S_x)v(S_y)^T||_F^2 \le ||M(S_x,S_y)-x(S_x)y(S_y)^T||_F^2. \end{aligned}$$

Let us denote $\bar{S_u} = S_x \backslash S_u$ and $\bar{S_v} = S_y \backslash S_v$. Since $x \in \mathcal{B }_+(u,\epsilon )$, there exists $\delta u$ such that $x = u + \epsilon \delta u$ with $||\delta u||_2 \le 1$ and $\delta u(\bar{S_u}) \ge 0$ since $u(\bar{S_u}) = 0$; symmetrically there exists $\delta v$ such that $y = v + \epsilon \delta v$ with $||\delta v||_2 \le 1$ and $\delta v(\bar{S_v}) \ge 0$.

Let us analyze the four submatrices of $M(S_x,S_y)$ corresponding to the decompositions $S_x = S_u \cup \bar{S_u}$ and $S_y = S_v \cup \bar{S_v}$.

1.
Submatrix $(S_u,S_v)$. Since $M(S_u,S_v) = \mathbf{1}_{|S_u|\times |S_v|}, u({S_u}) = \mathbf{1}_{|S_u|}$ and $v({S_v}) = \mathbf{1}_{|S_v|}$,
$$\begin{aligned} e_1 = ||M(S_u,S_v)-x(S_u)y(S_v)^T||_F^2 \ge ||M(S_u,S_v)-u(S_u)v(S_v)^T||_F^2 = 0. \end{aligned}$$
2.
Submatrix $(\bar{S_u},\bar{S_v})$. Since $u(\bar{S_u}) = 0, v(\bar{S_v}) = 0$ and $||M(\bar{S_u},\bar{S_v})||_F^2 \le |\bar{S_u}||\bar{S_v}|d^2 \le mnd^2$ for $d \ge 1$,
$$\begin{aligned} e_2&= ||M(\bar{S_u},\bar{S_v})-x(\bar{S_u})y(\bar{S_v})^T||_F^2 = ||M(\bar{S_u},\bar{S_v}) - \epsilon ^2 \delta u(\bar{S_u}) \delta v(\bar{S_v})^T||_F^2 \\&\ge ||M(\bar{S_u},\bar{S_v})||_F^2 - 2\sqrt{mn}\, d\, \epsilon ^2 ||\delta u(\bar{S_u}) \delta v(\bar{S_v})^T||_F. \end{aligned}$$
In fact, recall that $||A-B||_F^2 = ||A||_F^2 - 2 \sum _{ij} A_{ij} B_{ij} + ||B||_F^2 \ge ||A||_F^2 - 2 ||A||_F ||B||_F$.
3.
Submatrix $({S_u},\bar{S_v})$. Since $u({S_u}) = \mathbf{1}_{|S_u|}, v(\bar{S_v}) = \mathbf{0}_{|\bar{S_v}|}, d \ge 1$ and $\epsilon < 1$,
$$\begin{aligned} e_3&= ||M({S_u},\bar{S_v})-x({S_u})y(\bar{S_v})^T||_F^2 \\&= ||M({S_u},\bar{S_v}) - \epsilon (\mathbf{{1}}_{|S_u|}+\epsilon \delta u({S_u})) \delta v(\bar{S_v})^T||_F^2 \\&= ||M({S_u},\bar{S_v}) - \epsilon \mathbf{{1}}_{|S_u|} \delta v(\bar{S_v})^T - \epsilon ^2 \delta u({S_u}) \delta v(\bar{S_v})^T||_F^2 \\&\ge ||M({S_u},\bar{S_v}) - \epsilon \mathbf{{1}}_{|S_u|} \delta v(\bar{S_v})^T ||_F^2 - 2\sqrt{mn}(d+1) \epsilon ^2 ||\delta u({S_u}) \delta v(\bar{S_v})^T||_F. \end{aligned}$$
In fact, one can check that $|M({S_u},\bar{S_v}) - \epsilon \mathbf{{1}}_{|S_u|} \delta v(\bar{S_v})^T| \le d+1$ for $\epsilon < 1$ since $|\delta v(\bar{S_v})| \le 1$ implying that $||M({S_u},\bar{S_v}) - \epsilon \mathbf{{1}}_{|S_u|} \delta v(\bar{S_v})^T||_F^2 \le mn (d+1)^2$.

Because $(u,v)$ corresponds to a maximal biclique, there must be at least one $-d$ entry in each column of $M({S_u},\bar{S_v})$. Let us analyze each column separately. For any $i \in \bar{S_v}$, let $n_i \ge 1$ denote the number of $-d$ entries in the column $M({S_u},i)$. We have
$$\begin{aligned} ||M({S_u},i) - \epsilon \mathbf{{1}}_{|S_u|} \delta v(i) ||_F^2&= n_i (-d- \epsilon \delta v(i))^2 + (|{S_u}|-n_i) (1-\epsilon \delta v(i))^2 \\&\ge n_i d^2 + (|{S_u}|-n_i) + 2 \epsilon \delta v(i) (n_i d - |{S_u}|+n_i) \\&= ||M({S_u},i)||_F^2 + 2 \epsilon \delta v(i) (n_i d +n_i - |{S_u}|) \\&\ge ||M({S_u},i)||_F^2 + 2 \epsilon \delta v(i). \end{aligned}$$
In fact, $n_i d \ge d \ge \max (m,n) \ge |{S_u}|$, so that $n_i d + n_i - |{S_u}| \ge n_i \ge 1$ (it is then actually sufficient to take $d > \max (m,n)-1$). Finally, recalling that $\delta v(\bar{S_v}) \ge 0$ and summing over the indices $i \in \bar{S_v}$, we obtain
$$\begin{aligned} e_3&\ge ||M({S_u},\bar{S_v}) - u(S_u) v(\bar{S_v})^T ||_F^2 + 2 \epsilon ||\delta v(\bar{S_v})||_1\\&- 2 \sqrt{mn}(d+1) \epsilon ^2 ||\delta u({S_u}) \delta v(\bar{S_v})^T||_F. \end{aligned}$$
4.
Submatrix $(\bar{S_u},{S_v})$. By symmetry, the same can be done as for the submatrix $({S_u},\bar{S_v})$, and we have
$$\begin{aligned} e_4&= ||M(\bar{S_u},{S_v}) - x(\bar{S_u})y({S_v})^T ||_F^2 \\&\ge ||M(\bar{S_u},{S_v}) - u(\bar{S_u}) v({S_v})^T ||_F^2 + 2 \epsilon ||\delta u(\bar{S_u})||_1\\&- 2 \sqrt{mn}(d+1) \epsilon ^2 ||\delta u(\bar{S_u}) \delta v({S_v})^T||_F. \end{aligned}$$
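The column-wise bound used for the submatrix $({S_u},\bar{S_v})$ can be checked numerically. The sketch below is only a sanity check, not part of the proof: it writes $t = \epsilon \delta v(i) \in [0,1)$ and $s = |S_u|$, and compares the exact column error with the lower bound $||M(S_u,i)||_F^2 + 2t$ over a small grid with $d \ge s$ and $1 \le n_i \le s$. The function names and the grid are hypothetical.

```python
import numpy as np

def column_error(n_i, s, d, t):
    """Exact error of a column with n_i entries equal to -d and s - n_i
    entries equal to 1, approximated by the constant value t."""
    return n_i * (-d - t) ** 2 + (s - n_i) * (1 - t) ** 2

def column_lower_bound(n_i, s, d, t):
    """Lower bound ||M(S_u, i)||^2 + 2 t stated in the derivation."""
    return n_i * d ** 2 + (s - n_i) + 2 * t

# Smallest slack (error minus bound) over the grid; it should be nonnegative.
smallest_slack = min(
    column_error(n_i, s, d, t) - column_lower_bound(n_i, s, d, t)
    for s in range(1, 6)            # s = |S_u|
    for d in range(s, s + 4)        # d >= |S_u|, implied by d >= max(m, n)
    for n_i in range(1, s + 1)      # at least one -d entry (maximality)
    for t in np.linspace(0.0, 0.99, 34)
)
```

Expanding both expressions shows that the slack equals $2t(n_i d + n_i - s - 1) + s t^2$, which is nonnegative under the stated assumptions.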

Combining the above results and noting $C = 2\sqrt{mn}(d+1)$, we have

$$\begin{aligned} e_{T}&= e_1+e_2+e_3+e_4 \\&= ||M({S_x},{S_y}) - x({S_x})y({S_y})^T ||_F^2 \\&\ge ||M({S_x},{S_y}) - u({S_x})v({S_y})^T ||_F^2 + 2 \epsilon ||\delta u(\bar{S_u})||_1 + 2 \epsilon ||\delta v(\bar{S_v})||_1 \\&- C \epsilon ^2 ||\delta u(\bar{S_u}) \delta v(\bar{S_v})^T||_F - C \epsilon ^2 (||\delta u(\bar{S_u}) \delta v({S_v})^T||_F+||\delta u({S_u}) \delta v(\bar{S_v})^T||_F). \end{aligned}$$

Recalling that $||x||_1 \ge ||x||_2$ for any $x \in \mathbb{R }^n, ||xy^T||_F = ||x||_2||y||_2$ for any $x \in \mathbb{R }^n$ and $y \in \mathbb{R }^m$, and that $||\delta u||_2 \le 1$ and $||\delta v||_2 \le 1$, we have that for any $0 < \epsilon < \frac{1}{C}$

$$\begin{aligned} e_{T}&\ge ||M({S_x},{S_y}) - u({S_x})v({S_y})^T ||_F^2 \\&+ \epsilon ||\delta u(\bar{S_u})||_2^{\frac{1}{2}} \left( 2 - C \epsilon ||\delta u(\bar{S_u})||_2^{\frac{1}{2}} ||\delta v(\bar{S_v})^T||_2 - C \epsilon ||\delta u(\bar{S_u})||_2^{\frac{1}{2}} ||\delta v({S_v})^T||_2 \right) \\&+ \epsilon ||\delta v(\bar{S_v})||_2^{\frac{1}{2}} \left( 2 - C\epsilon ||\delta v(\bar{S_v})||_2^{\frac{1}{2}} ||\delta u(\bar{S_u})^T||_2 - C \epsilon ||\delta v(\bar{S_v})||_2^{\frac{1}{2}} ||\delta u({S_u})^T||_2 \right) \\&\ge ||M({S_x},{S_y}) - u({S_x})v({S_y})^T ||_F^2 + 2 \epsilon (1-C \epsilon ) (||\delta u(\bar{S_u})||_2^{\frac{1}{2}}+||\delta v(\bar{S_v})||_2^{\frac{1}{2}}) \\&\ge ||M({S_x},{S_y}) - u({S_x})v({S_y})^T ||_F^2. \end{aligned}$$

Finally, for any $d \ge \max (m,n), uv^T \in B(G), 0 < \epsilon < \frac{1}{2mn(d+1)^2}, x \in \mathcal{B }_+(u, \epsilon )$ and $y \in \mathcal{B }_+(v,\epsilon )$, we have $||M-uv^T||_F^2 \le ||M-xy^T||_F^2$.
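The local optimality statement can be probed empirically on a small instance. The following sketch samples random nonnegative points within distance $\epsilon$ of the indicator vectors of a maximal biclique and records the smallest change in the objective. It assumes that $M$ is the biadjacency matrix with zeros replaced by $-d$ and that $\mathcal{B }_+(u,\epsilon )$ denotes the nonnegative vectors within Euclidean distance $\epsilon$ of $u$; the toy graph, the helper names and the sample size are hypothetical, and a random test of this kind can only fail to find a counterexample, it proves nothing.

```python
import numpy as np

def cost(M, x, y):
    """Objective ||M - x y^T||_F^2."""
    return float(np.sum((M - np.outer(x, y)) ** 2))

def sample_ball_plus(center, eps, rng):
    """Random nonnegative point within Euclidean distance eps of a nonnegative center."""
    delta = rng.standard_normal(center.size)
    delta *= rng.uniform() / np.linalg.norm(delta)   # now ||delta||_2 <= 1
    # Clipping at zero is a projection onto a convex set containing the
    # center, so it cannot increase the distance to the center.
    return np.maximum(0.0, center + eps * delta)

A = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 1, 1]])
m, n = A.shape
d = max(m, n)
M = np.where(A > 0, 1.0, -float(d))

# Maximal biclique {0, 1} x {0, 1}: no row or column can be added.
u = np.array([1.0, 1.0, 0.0])
v = np.array([1.0, 1.0, 0.0])

eps = 0.9 / (2 * m * n * (d + 1) ** 2)   # inside the range given at the end of the proof
rng = np.random.default_rng(0)
base = cost(M, u, v)
worst_gap = min(
    cost(M, sample_ball_plus(u, eps, rng), sample_ball_plus(v, eps, rng)) - base
    for _ in range(2000)
)
```

A nonnegative `worst_gap` (up to rounding) is consistent with $uv^T$ being a local minimum on this instance.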

Appendix B: Proof of Theorem 6

Let $(u,v)$ be a nontrivial saddle point of $\text{ R1N }_d(G)$ (hence $uv^T \in \mathcal{S }_d(G)$). Let us denote the (non-empty) support of $u$ as $K = \text{ supp }(u)$ and the (non-empty) support of $v$ as $L = \text{ supp }(v)$, and define $u^{\prime } = u(K), v^{\prime } = v(L)$ and $M^{\prime } = M(K,L)$ to be the subvectors and submatrix with indexes in $K, L$ and $K \times L$, respectively. Let us also define $G^{\prime }$ as the bipartite graph whose biadjacency matrix is given by $A(K,L)$.

Observe that $(u^{\prime },v^{\prime })$ must be a saddle point of R1N($G^{\prime }$), otherwise $(u,v)$ would not be a saddle point of $\text{ R1N }_d(G)$. In fact, the objective functions of these two problems differ only by an additive constant: we have $||M-uv^T||_F^2 = ||M^{\prime }-u^{\prime }v^{\prime T}||_F^2 + ||M||_F^2-||M^{\prime }||_F^2$. By stationarity of $(u,v)$, Eq. (6) gives

$$\begin{aligned} u^{\prime } = \frac{M^{\prime }v^{\prime }}{||v^{\prime }||_2^2} \quad \text{ and } \quad v^{\prime } = \frac{M^{\prime T}u^{\prime }}{||u^{\prime }||_2^2}. \end{aligned}$$

Therefore, $(u^{\prime }/||u^{\prime }||_2,v^{\prime }/||v^{\prime }||_2) > 0$ defines a pair of singular vectors of $M^{\prime }$ associated with the singular value $||u^{\prime }||_2||v^{\prime }||_2 > 0$.
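The step from the stationarity equations to singular vectors can be verified on a small example. The sketch below uses an all-ones block as $M^{\prime }$ and a rescaled pair of constant vectors, both chosen only for illustration, and checks that the normalized pair satisfies $M^{\prime }\hat{v} = \sigma \hat{u}$ and $M^{\prime T}\hat{u} = \sigma \hat{v}$ with $\sigma = ||u^{\prime }||_2||v^{\prime }||_2$. The function names are hypothetical.

```python
import numpy as np

def is_stationary(M, u, v):
    """Check u = M v / ||v||^2 and v = M^T u / ||u||^2 (positive-support case)."""
    return bool(np.allclose(u, M @ v / (v @ v)) and np.allclose(v, M.T @ u / (u @ u)))

def singular_pair_residuals(M, u, v):
    """Residuals of M v_hat = sigma u_hat and M^T u_hat = sigma v_hat."""
    sigma = np.linalg.norm(u) * np.linalg.norm(v)
    u_hat = u / np.linalg.norm(u)
    v_hat = v / np.linalg.norm(v)
    r1 = np.linalg.norm(M @ v_hat - sigma * u_hat)
    r2 = np.linalg.norm(M.T @ u_hat - sigma * v_hat)
    return sigma, r1, r2

M_sub = np.ones((2, 3))            # illustrative M' without any -d entry
u_sub = 2.0 * np.ones(2)           # the scaling of u and v is not unique,
v_sub = 0.5 * np.ones(3)           # only the product u v^T is determined

stationary = is_stationary(M_sub, u_sub, v_sub)
sigma, r1, r2 = singular_pair_residuals(M_sub, u_sub, v_sub)
top_singular_value = np.linalg.svd(M_sub, compute_uv=False)[0]
```

Here $\sigma = \sqrt{6}$ coincides with the largest singular value of the all-ones block, in line with the case treated next where $M^{\prime }$ contains no $-d$ entry.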

If $M^{\prime }$ does not contain any $-d$ entries, then $(u^{\prime },v^{\prime }) = (\mathbf{1 }_{|K|}, \mathbf{1 }_{|L|})$ is the unique pair of positive singular vectors (up to a constant factor). We then have that $uv^T \in F(G)$. By Theorem 3, $uv^T \in B(G) = \mathcal{L }_d(G)$ is then a local minimum since $F(G) \cap \mathcal{S }_d(G) = B(G) = \mathcal{L }_d(G)$ for any $d \ge \max (m,n)$, a contradiction.

Therefore $M^{\prime }$ contains at least one $-d$ entry. By Lemma 2, any pair of singular vectors of $M^{\prime }$ associated with the largest singular value of $M^{\prime }$ must contain at least one non-positive entry. Therefore, $(u^{\prime },v^{\prime })$ is a pair of positive singular vectors of $M^{\prime }$ not associated with the largest singular value of $M^{\prime }$, i.e., it is a saddle point of R1U($M^{\prime }$).

An example of such a saddle point is given in Example 1.
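The property attributed to Lemma 2 above can be illustrated numerically on a hypothetical $2 \times 2$ matrix with a single $-d$ entry and $d = 2$; this matrix is an assumption made for illustration and is not the paper's Example 1. The sketch computes the leading singular pair with NumPy and checks that, whichever sign is chosen, the pair is not entrywise positive.

```python
import numpy as np

def leading_pair_is_positive(M, tol=1e-12):
    """Return True if the leading singular pair of M can be chosen entrywise positive."""
    U, s, Vt = np.linalg.svd(M)
    u1, v1 = U[:, 0], Vt[0]
    for sign in (1.0, -1.0):   # singular vectors are defined up to a common sign
        if np.all(sign * u1 > tol) and np.all(sign * v1 > tol):
            return True
    return False

d = 2
M_with_hole = np.array([[1.0, 1.0],
                        [1.0, -float(d)]])   # one -d entry, d = max(m, n)
M_all_ones = np.ones((2, 2))                 # no -d entry, for comparison

hole_positive = leading_pair_is_positive(M_with_hole)
ones_positive = leading_pair_is_positive(M_all_ones)
```

For the all-ones matrix the leading pair is positive, whereas the $-d$ entry forces a sign change in the leading pair of the first matrix.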


About this article

Cite this article

Gillis, N., Glineur, F. A continuous characterization of the maximum-edge biclique problem. J Glob Optim 58, 439–464 (2014). https://doi.org/10.1007/s10898-013-0053-2


Received: 11 May 2012
Accepted: 27 February 2013
Published: 20 March 2013
Issue Date: March 2014
DOI: https://doi.org/10.1007/s10898-013-0053-2
