Abstract
Biclustering is a machine learning problem that deals with the simultaneous clustering of the rows and columns of a data matrix. Complex structures in the data matrix, such as overlapping biclusters, challenge existing methods. In this paper, we first provide a unified formulation of biclustering based on structured regularized matrix decomposition, which synthesizes various existing methods, and then develop a new biclustering method, called BCEL, from this formulation. The biclustering problem is formulated as a penalized least-squares problem that approximates the data matrix \(\mathbf{X}\) by a multiplicative matrix decomposition \(\mathbf{U}\mathbf{V}^T\) with sparse columns in both \(\mathbf{U}\) and \(\mathbf{V}\). The squared \(\ell_{1,2}\)-norm penalty, also called the exclusive Lasso penalty, is applied to both \(\mathbf{U}\) and \(\mathbf{V}\) to help identify the rows and columns included in the biclusters. The penalized least-squares problem is solved by a novel computational algorithm that combines alternating minimization with the proximal gradient method. A subsampling-based procedure called stability selection is developed to select the tuning parameters and determine bicluster membership. BCEL is shown to be competitive with existing methods in simulation studies and in an application to a real-world single-cell RNA sequencing dataset.
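To make the formulation in the abstract concrete, the following is a minimal Python sketch of a penalized least-squares objective with a squared \(\ell_1\)-type penalty on the factors, together with one way to alternate proximal gradient updates between \(\mathbf{U}\) and \(\mathbf{V}\). It is not the BCEL implementation. Several choices are assumptions made only for illustration: the penalty is taken as the sum of squared \(\ell_1\) norms of the columns of each factor, the fit term carries a factor of 1/2, the step size is the reciprocal of the Lipschitz constant of each block, the factors are randomly initialized, and the names `prox_sq_l1`, `objective`, `prox_grad_step`, and `alternate` are hypothetical. The grouping and scaling used in the paper may differ, and stability selection is not shown.

```python
import numpy as np


def prox_sq_l1(v, lam):
    """Prox of x -> lam * ||x||_1**2 at the float vector v (closed form via sorting)."""
    a = np.sort(np.abs(v))[::-1]              # magnitudes, largest first
    csum = np.cumsum(a)
    k = np.arange(1, a.size + 1)
    # Entry k stays active while a_k exceeds the threshold implied by the top-k sum.
    active = a * (1.0 + 2.0 * lam * k) > 2.0 * lam * csum
    if not active.any():                      # only happens when v is all zeros
        return np.zeros_like(v)
    k_star = k[active][-1]
    s = csum[k_star - 1] / (1.0 + 2.0 * lam * k_star)   # l1 norm of the solution
    return np.sign(v) * np.maximum(np.abs(v) - 2.0 * lam * s, 0.0)


def objective(X, U, V, lam_u, lam_v):
    """0.5*||X - U V^T||_F^2 plus column-wise squared-l1 penalties (assumed grouping)."""
    fit = 0.5 * np.linalg.norm(X - U @ V.T, "fro") ** 2
    pen_u = np.sum(np.abs(U).sum(axis=0) ** 2)
    pen_v = np.sum(np.abs(V).sum(axis=0) ** 2)
    return fit + lam_u * pen_u + lam_v * pen_v


def prox_grad_step(X, A, B, lam):
    """One proximal gradient step on A for 0.5*||X - A B^T||_F^2 + lam*penalty(A), B fixed."""
    lipschitz = max(np.linalg.norm(B.T @ B, 2), 1e-12)  # spectral norm of B^T B
    step = 1.0 / lipschitz
    G = A - step * (A @ B.T - X) @ B          # gradient step on the smooth fit term
    # The assumed penalty separates over columns, so the prox is applied per column.
    cols = [prox_sq_l1(G[:, j], step * lam) for j in range(A.shape[1])]
    return np.column_stack(cols)


def alternate(X, K, lam_u, lam_v, n_iter=50, seed=0):
    """Alternate proximal gradient updates of U and V from a random start."""
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((X.shape[0], K))
    V = rng.standard_normal((X.shape[1], K))
    history = [objective(X, U, V, lam_u, lam_v)]
    for _ in range(n_iter):
        U = prox_grad_step(X, U, V, lam_u)    # update U with V fixed
        V = prox_grad_step(X.T, V, U, lam_v)  # update V with U fixed
        history.append(objective(X, U, V, lam_u, lam_v))
    return U, V, history
```

In this sketch, calling `alternate(X, K=2, lam_u=0.1, lam_v=0.1)` returns the two factors and the objective value after each sweep. With the step size above, each block update cannot increase the objective, which gives a simple sanity check. Rows and columns whose entries in a given column of `U` and `V` are nonzero would be read as members of the corresponding bicluster under these assumptions.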
Funding
Part of the work of Jianhua Z. Huang was done while he was at Texas A&M University and was partly supported by NSF Grants No. 1956219 and 1900990. Huang was also partly supported by funding from the Pengcheng Peacock Program of Shenzhen.
Cite this article
Zhong, Y., Huang, J.Z. Biclustering via structured regularized matrix decomposition. Stat Comput 32, 37 (2022). https://doi.org/10.1007/s11222-022-10095-1
DOI: https://doi.org/10.1007/s11222-022-10095-1