Abstract
Data clustering aims to group input data instances into clusters such that instances within a cluster are highly similar to each other; it is a fundamental task, arising either on its own or as an intermediate step, in machine learning, pattern recognition, and information retrieval. Clustering algorithms based on graph regularized extensions have attracted considerable interest over the past two decades, and the performance of this category of approaches is largely determined by the data similarity matrix, which is usually computed from a predefined model with a carefully tuned parameter combination. Such fixed models lack flexibility and may not be optimal in practice. In this paper, we consider both discriminative information and the data manifold from a matrix factorization point of view, and propose an adaptive local learning regularized nonnegative matrix factorization (ALLRNMF) approach for data clustering, which assumes that instance pairs with a smaller distance should have a larger probability of being assigned as probabilistic neighbors. ALLRNMF learns the data similarity matrix under this assumption while simultaneously performing the nonnegative matrix factorization. The constraint on the similarity matrix encodes both the discriminative information and the learned adaptive local structure, which benefits clustering on the manifold. To solve the optimization problem of our approach, we propose an effective alternating optimization algorithm that decomposes the objective function into several subproblems, each of which has an optimal solution, and whose convergence is theoretically guaranteed. Experiments on real-world benchmark datasets demonstrate the superior performance of our approach over existing clustering approaches.
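The two coupled steps described in the abstract — learning probabilistic neighbor weights from pairwise distances, and factorizing the data with the learned similarity graph as a regularizer — can be sketched as follows. This is a minimal illustration and not the authors' implementation: the closed-form neighbor assignment follows the adaptive-neighbor rule of Nie et al. (2014), the factorization uses standard multiplicative updates in the style of graph-regularized NMF, and the function names (`adaptive_similarity`, `graph_nmf`) and parameter choices are assumptions made for the sketch.

```python
import numpy as np

def adaptive_similarity(X, k=5):
    """Closed-form adaptive-neighbor similarities: each instance is
    assigned exactly k probabilistic neighbors, with smaller distances
    receiving larger weights (weights over the k neighbors sum to 1)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared distances
    S = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(d2[i])[1:k + 2]    # k nearest plus one extra, excluding self
        d = d2[i, idx]
        denom = k * d[k] - d[:k].sum() + 1e-12
        S[i, idx[:k]] = np.maximum((d[k] - d[:k]) / denom, 0.0)
    return (S + S.T) / 2                    # symmetrize for use as a graph

def graph_nmf(X, S, r=3, lam=0.1, iters=200, seed=0):
    """Graph-regularized NMF, X ≈ U V^T, with multiplicative updates;
    the Laplacian penalty tr(U^T (D - S) U) pulls neighbors' rows of U together."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = rng.random((n, r))
    V = rng.random((m, r))
    D = np.diag(S.sum(1))
    for _ in range(iters):
        U *= (X @ V + lam * S @ U) / (U @ (V.T @ V) + lam * D @ U + 1e-12)
        V *= (X.T @ U) / (V @ (U.T @ U) + 1e-12)
    return U, V
```

In the full ALLRNMF model the similarity matrix is re-learned jointly with the factors inside the alternating scheme rather than fixed up front as here; cluster assignments can then be read off as `U.argmax(1)`.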
Notes
Wap, La1 and tr12 are publicly available from http://archive.ics.uci.edu/ml/datasets.html.
Vote, Diag-Bcw, Abalone, Krvs and Caltech101 Silhouettes are publicly available from http://glaros.dtc.umn.edu/gkhome/views/cluto/.
tr12 is publicly available from http://trec.nist.gov.
TDT2-20 is publicly available from http://www.nist.gov/speech/tests/tdt/tdt98/index.htm.
Appendices
Appendix A: Proof of Theorem 2
Proof
We rewrite (33) as
By applying Lemma 2, we have
To obtain the lower bound for the remaining terms, we use the inequality that
Then
By summing over all the bounds, we obtain g(U, U′), which obviously satisfies: (1) \(g(\textbf {U}, \textbf {U}^{\prime }) \ge J_{\mathrm {ALLRNMF}}(\textbf {U})\); (2) \(g(\textbf {U}, \textbf {U}) = J_{\mathrm {ALLRNMF}}(\textbf {U})\).
To find the minimum of g(U, U′), we take the Hessian matrix of g(U, U′)
which is a diagonal matrix with positive diagonal elements. So g(U, U′) is a convex function of U, and we can obtain the global minimum of g(U, U′) by setting \( \frac {\partial g(\textbf {U}, \textbf {U}^{\prime })}{\partial \textbf {U}_{ij}}= 0\) and solving for U, from which we can get (34). □
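For reference, the bounds typically invoked at this step of multiplicative-update convergence proofs for NMF (in the style of Lee and Seung, and Ding et al.) are the following; whether these coincide exactly with Lemma 2 and inequality (40) of the paper cannot be confirmed from this excerpt, so they are stated here as an assumption.

```latex
% Upper bound for a quadratic term (A symmetric, entrywise nonnegative):
\operatorname{Tr}(\mathbf{U}^{\top}\mathbf{A}\mathbf{U})
  \le \sum_{ij} \frac{(\mathbf{A}\mathbf{U}')_{ij}\,\mathbf{U}_{ij}^{2}}{\mathbf{U}'_{ij}}.
% Lower bound for a linear term, from z \ge 1 + \log z with z = U_{ij}/U'_{ij}:
\sum_{ij} \mathbf{B}_{ij}\,\mathbf{U}_{ij}
  \ge \sum_{ij} \mathbf{B}_{ij}\,\mathbf{U}'_{ij}
      \Bigl(1 + \log \frac{\mathbf{U}_{ij}}{\mathbf{U}'_{ij}}\Bigr).
```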
Appendix B: Proof of Theorem 4
Proof
We rewrite (35) as
By applying Lemma 2, we have
To obtain the lower bound for the remaining terms, we use the inequality in (40), then
By summing over all the bounds, we obtain g(V, V′), which obviously satisfies: (1) \(g(\textbf {V}, \textbf {V}^{\prime }) \ge J_{\mathrm {ALLRNMF}}(\textbf {V})\); (2) \(g(\textbf {V}, \textbf {V}) = J_{\mathrm {ALLRNMF}}(\textbf {V})\).
To find the minimum of g(V, V′), we take the Hessian matrix of g(V, V′)
which is a diagonal matrix with positive diagonal elements. So g(V, V′) is a convex function of V, and we can obtain the global minimum of g(V, V′) by setting \( \frac {\partial g(\textbf {V}, \textbf {V}^{\prime })}{\partial \textbf {V}_{ij}}= 0\) and solving for V, from which we can get (36). □
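The auxiliary functions constructed in both appendices yield monotone convergence by the standard majorization–minimization argument: if \(\mathbf {U}^{(t+1)} = \arg \min _{\mathbf {U}} g(\mathbf {U}, \mathbf {U}^{(t)})\), then

```latex
J_{\mathrm{ALLRNMF}}(\mathbf{U}^{(t+1)})
  \le g(\mathbf{U}^{(t+1)}, \mathbf{U}^{(t)})
  \le g(\mathbf{U}^{(t)}, \mathbf{U}^{(t)})
  = J_{\mathrm{ALLRNMF}}(\mathbf{U}^{(t)}),
```

where the first inequality uses property (1), the second uses the minimization, and the equality uses property (2). The same chain holds for V, so the objective is nonincreasing under the updates (34) and (36) and, being bounded below, converges.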
Cite this article
Sheng, Y., Wang, M., Wu, T. et al. Adaptive local learning regularized nonnegative matrix factorization for data clustering. Appl Intell 49, 2151–2168 (2019). https://doi.org/10.1007/s10489-018-1380-2