Abstract
Existing matrix-factorization-based techniques, such as nonnegative matrix factorization and concept factorization, have been widely applied to data representation. To make the learned concepts as close to the original data points as possible, a state-of-the-art method called locality-constrained concept factorization was put forward; it represents each data point by a linear combination of only a few nearby basis concepts. However, its locality constraint does not fully reveal the intrinsic data structure, since it only requires each concept to be close to the original data points. To address this problem, we incorporate the manifold geometric structure into local concept factorization via graph-based learning and propose a novel algorithm, called graph-regularized local coordinate concept factorization (GRLCF). By constructing a parameter-free graph with the constrained Laplacian rank (CLR) algorithm, we also present an extension of GRLCF, denoted \(\hbox {GRLCF}_{\mathrm{CLR}}\). Moreover, we develop iterative updating schemes for the optimization and prove their convergence. Since GRLCF simultaneously considers the geometric structure of the data manifold and the locality conditions as additional constraints, it obtains a more compact and better-structured data representation. Experimental results on the ORL, Yale and MNIST image datasets demonstrate the effectiveness of the proposed algorithm.
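To give a concrete feel for this family of methods, the following is a minimal NumPy sketch of multiplicative updates for a plain graph-regularized concept factorization objective \(\left\| {{\varvec{X}}-{\varvec{XWV}}^{T}} \right\| _F^2 +\lambda \hbox {Tr}\left( {{\varvec{V}}^{T}{\varvec{LV}}} \right) \). It deliberately omits the local-coordinate (locality) terms that distinguish GRLCF's actual update rules, and every name in it (`grcf`, `S`, `lam`) is illustrative rather than taken from the paper.

```python
import numpy as np

def grcf(X, S, k, lam=1.0, n_iter=200, eps=1e-10, seed=0):
    """Sketch of multiplicative updates for graph-regularized concept
    factorization:  min_{W,V>=0} ||X - X W V^T||_F^2 + lam*Tr(V^T L V),
    where X is (m, n) nonnegative data, S is an (n, n) symmetric
    nonnegative affinity graph and L = D - S is its Laplacian."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    W = rng.random((n, k))      # basis (concept) coefficients
    V = rng.random((n, k))      # new representation of the n points
    K = X.T @ X                 # Gram matrix: the updates only need K
    D = np.diag(S.sum(axis=1))  # degree matrix of the graph
    for _ in range(n_iter):
        # split each gradient into nonnegative parts -> multiplicative rules
        W *= (K @ V) / (K @ W @ (V.T @ V) + eps)
        V *= (K @ W + lam * (S @ V)) / (V @ (W.T @ K @ W) + lam * (D @ V) + eps)
    return W, V
```

These are the standard LCCF-style rules obtained by splitting each gradient into its nonnegative and nonpositive parts; GRLCF's Eqs. (16) and (17) carry additional locality terms in the numerators and denominators.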
References
Zhao M, Chow TWS, Zhang Z, Wu Z (2015) Learning from normalized local and global discriminative information for semi-supervised regression and dimensionality reduction. Inf Sci 324(10):286–309
Zhao M, Zhang Z, Chow TWS (2012) Trace ratio criterion based generalized discriminative learning for semi-supervised dimension reduction. Pattern Recognit 45(4):1482–1499
Zhao M, Zhang Z, Chow TWS, Li B (2014) A general soft label based linear discriminant analysis for semi-supervised dimension reduction. Neural Netw 55:83–97
Li P, Chen C, Bu J (2012) Clustering analysis using manifold kernel concept factorization. Neurocomputing 87:120–131
Jain A, Murty M, Flynn P (1999) Data clustering: a review. ACM Comput Surv 31(3):264–323
MacQueen JB (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of 5th Berkeley symposium on mathematical statistics and probability, University of California Press, Berkeley, pp 281–297
Ng AY, Jordan MI, Weiss Y (2002) On spectral clustering: analysis and an algorithm. Adv Neural Inf Process Syst (NIPS) 14:849–856
Nie F, Zeng Z, Tsang IW, Xu D, Zhang C (2011) Spectral embedded clustering: a framework for in-sample and out-of-sample spectral clustering. IEEE Trans Neural Netw 22(11):1796–1808
Yang Y, Shen H, Nie F et al (2011) Nonnegative spectral clustering with discriminative regularization. In: Proceedings of the 25th AAAI conference on artificial intelligence (AAAI’ 11), pp 555–560
Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401:788–791
Xu W, Gong Y (2004) Document clustering by concept factorization. In: Proceedings of the 2004 international conference on research and development in information retrieval (SIGIR’04), Sheffield, UK, pp 202–209
Nie F, Ding CHQ, Luo D, Huang H (2010) Improved minmax cut graph clustering with nonnegative relaxation. In: ECML/PKDD, pp 451–466
Huang J, Nie F, Huang H, Ding C (2014) Robust manifold nonnegative matrix factorization. ACM Trans Knowl Discov Data 8(3), Article 11
Lu M, Zhao X, Zhang L, Li F (2016) Semi-supervised concept factorization for document clustering. Inf Sci 331:86–98
Belkin M, Niyogi P (2001) Laplacian eigenmaps and spectral techniques for embedding and clustering. In: Advances in neural information processing systems 14. MIT Press, Cambridge, MA, pp 585–591
Roweis S, Saul L (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500):2323–2326
Tenenbaum J, de Silva V, Langford J (2000) A global geometric framework for nonlinear dimensionality reduction. Science 290(5500):2319–2323
Belkin M, Niyogi P, Sindhwani V (2006) Manifold regularization: a geometric framework for learning from examples. J Mach Learn Res 7:2399–2434
Roweis S, Saul L (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500):2323–2326
Cai D, He X, Han J, Huang T (2011) Graph regularized nonnegative matrix factorization for data representation. IEEE Trans Pattern Anal Mach Intell 33:1548–1560
Cai D, He X, Han J (2011) Locally consistent concept factorization for document clustering. IEEE Trans Knowl Data Eng 23(6):902–913
Nie F, Wang X, Jordan MI, Huang H (2016) The constrained Laplacian rank algorithm for graph-based clustering. In: The 30th AAAI conference on artificial intelligence (AAAI), Phoenix, USA
Nie F, Wang X, Huang H (2014) Clustering and projected clustering with adaptive neighbors. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, pp 977–986
Yu K, Zhang T, Gong Y (2009) Nonlinear learning using local coordinate coding. In: Proceedings of the advances in neural information processing systems, pp 2223–2231
Chen Y, Zhang J, Cai D, Liu W, He X (2013) Nonnegative local coordinate factorization for image representation. IEEE Trans Image Process 22(3):969–979
Liu H, Yang Z, Yang J, Wu Z, Li X (2014) Local coordinate concept factorization for image representation. IEEE Trans Neural Netw Learn Syst 25(6):1071–1081
Lee DD, Seung HS (2001) Algorithms for non-negative matrix factorization. Adv Neural Inf Process Syst (NIPS) 13:556–562
Zhao M, Chow TWS, Zhang Z, Li B (2015) Automatic image annotation via compact graph based semi-supervised learning. Knowl Based Syst 76:148–165
Sha F, Saul LK, Lee DD (2007) Multiplicative updates for nonnegative quadratic programming. Neural Comput 19(8):2004–2031
Lovász L, Plummer M (1986) Matching theory. Akadémiai Kiadó, Budapest
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Series B (Methodol) 39(1):1–38
Acknowledgements
This work is partially supported by the National Natural Science Foundation of China under Grant Nos. 61373063, 61233011, 61125305, 61375007, 61220301, and by the National Basic Research Program of China under Grant No. 2014CB349303. This work is also supported in part by the Natural Science Foundation of Jiangsu Province (BK20150867), the Natural Science Research Foundation for Jiangsu Universities (13KJB510022), and the Talent Introduction Foundation and Natural Science Foundation of Nanjing University of Posts and Telecommunications (NY212014, NY212039, NY215125).
Appendix: Proof of Theorem 1
To prove Theorem 1, we need to show that the objective function \({\varvec{J}}_{\varvec{GRLCF}} \) in Eq. (13) is nonincreasing under the updating rules stated in Eqs. (16) and (17). We make use of an auxiliary function, similar to the one used in the EM algorithm [31], to prove the convergence. We begin with the definition of an auxiliary function.
Definition 2
The function \(G\left( {x,{x}'} \right) \) is an auxiliary function for \(F(x)\) if \(G\left( {x,{x}'} \right) \ge F(x)\) and \(G\left( {x,x} \right) =F(x)\) are satisfied.
The auxiliary function is very useful because of the following lemma.
Lemma 1
If \(G\) is an auxiliary function of \(F\), then \(F\) is nonincreasing under the update
\(x^{(t+1)}=\mathop {\arg \min }\limits _x G\left( {x,x^{(t)}} \right) \qquad (27)\)
Proof
\(F\left( {x^{(t+1)}} \right) \le G\left( {x^{(t+1)},x^{(t)}} \right) \le G\left( {x^{(t)},x^{(t)}} \right) =F\left( {x^{(t)}} \right) \).
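As a toy numeric illustration of Lemma 1 (our example, not the paper's), take \(F(x)=(x-3)^{2}\) and \(G(x,{x}')=F(x)+(x-{x}')^{2}\), which satisfies both conditions of Definition 2. Minimizing \(G(\cdot ,x^{(t)})\) in closed form gives \(x^{(t+1)}=(3+x^{(t)})/2\), and \(F\) indeed never increases along the iterates:

```python
# Toy check of Lemma 1: G(x, x') = F(x) + (x - x')^2 is an auxiliary
# function for F, since G(x, x') >= F(x) and G(x, x) = F(x).
F = lambda x: (x - 3.0) ** 2

x = 10.0
prev = F(x)
for _ in range(20):
    x = (3.0 + x) / 2.0   # closed-form argmin_x G(x, x_t), i.e. Eq. (27)
    assert F(x) <= prev   # F is nonincreasing, as Lemma 1 guarantees
    prev = F(x)
print(x)                  # converges to 3.0, the minimizer of F
```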
Next we will show that the updating rule for \({\varvec{V}}\) in Eq. (17) is exactly the update in Eq. (27) with a proper auxiliary function.
Considering any element \(v_{ab} \) in \({\varvec{V}}\), we use \(F_{v_{ab} } \) to denote the part of \({\varvec{J}}_{\varvec{GRLCF}} \) that is relevant only to \(v_{ab} \); its first- and second-order derivatives \({F}'_{v_{ab} } \) and \({F}''_{v_{ab} } \) with respect to \(v_{ab} \) are easy to check.
Since our update is essentially element-wise, it is sufficient to show that each \(F_{v_{ab} } \) is nonincreasing under the update step of Eq. (17).
Lemma 2
The function \(G(v,v_{ab}^{(t)} )\) defined in Eq. (28) is an auxiliary function for \(F_{v_{ab} } \).
Proof
Since \(G(v,v)=F_{v_{ab} } (v)\) is obvious, we only need to show that \(G(v,v_{ab}^{(t)} )\ge F_{v_{ab} } (v)\). To do this, we compare the Taylor series expansion of \(F_{v_{ab} } (v)\),
\(F_{v_{ab} } (v)=F_{v_{ab} } (v_{ab}^{(t)} )+{F}'_{v_{ab} } (v_{ab}^{(t)} )(v-v_{ab}^{(t)} )+\frac{1}{2}{F}''_{v_{ab} } (v_{ab}^{(t)} )(v-v_{ab}^{(t)} )^{2},\)
with Eq. (28) to find that \(G(v,v_{ab}^{(t)} )\ge F_{v_{ab} } (v)\) is equivalent to the condition stated in Eq. (29).
From the definitions of \({\varvec{A}}\) and \({\varvec{B}}\), it is easy to check that \({\varvec{A}}\ge 0\) and \({\varvec{B}}\ge 0\). Thus, Eq. (29) holds and \(G(v,v_{ab}^{(t)} )\ge F_{v_{ab} } (v)\).
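The inequality at the heart of this proof can also be checked numerically on a toy quadratic (assumed values, not the paper's \(F_{v_{ab} } \)): whenever the quadratic coefficient \(q\) of the surrogate dominates \(\frac{1}{2}{F}''\), as an Eq. (29)-type condition requires, the surrogate upper-bounds the function everywhere, because \(G(v,v^{(t)})-F(v)=(q-\frac{1}{2}{F}'')(v-v^{(t)})^{2}\ge 0\).

```python
import numpy as np

# F(v) = 2v^2 - 3v, so F'' = 4 and the bound requires q >= F''/2 = 2.
F  = lambda v: 2.0 * v ** 2 - 3.0 * v
dF = lambda v: 4.0 * v - 3.0

v_t, q = 1.5, 3.0   # expansion point; q = 3 satisfies q >= 2
G = lambda v: F(v_t) + dF(v_t) * (v - v_t) + q * (v - v_t) ** 2

grid = np.linspace(-5.0, 5.0, 1001)
assert np.all(G(grid) >= F(grid) - 1e-12)   # G dominates F everywhere
```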
Next we define an auxiliary function for the update rule in Eq. (16). Similarly, considering any element \(w_{ab}\) in \({\varvec{W}}\), we use \(F_{w_{ab} } \) to denote the part of \({\varvec{J}}_{\varvec{GRLCF}} \) that is relevant only to \(w_{ab} \). The auxiliary function regarding \(w_{ab} \) is then defined as follows:
Lemma 3
The function \(G(w,w_{ab}^{(t)} )\) defined in Eq. (30) is an auxiliary function for \(F_{w_{ab}}\).
The proof of Lemma 3 is essentially the same as that of Lemma 2 and is omitted here due to space limitations.
We can now prove the convergence stated in Theorem 1.
Proof of Theorem 1
Replacing \(G(v,v_{ab}^{(t)} )\) in Eq. (27) by Eq. (28) yields the update rule for \({\varvec{V}}\) in Eq. (17). Since Eq. (28) is an auxiliary function, \(F_{v_{ab} } \) is nonincreasing under this updating rule.
Similarly, replacing \(G(w,w_{ab}^{(t)} )\) in Eq. (27) by Eq. (30) yields the update rule for \({\varvec{W}}\) in Eq. (16). Since Eq. (30) is an auxiliary function, \(F_{w_{ab} } \) is nonincreasing under this updating rule.
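The monotonicity asserted by Theorem 1 can also be checked empirically. The snippet below runs the multiplicative updates of the earlier `grcf` sketch step by step on random nonnegative data and asserts that the (locality-free, surrogate) objective never increases; again, this illustrates the proof's conclusion rather than reproducing GRLCF's exact Eqs. (16) and (17).

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((50, 30))                        # nonnegative toy data
S = (rng.random((30, 30)) < 0.2).astype(float)  # random affinity graph
S = np.maximum(S, S.T)
np.fill_diagonal(S, 0.0)
D = np.diag(S.sum(axis=1))
L = D - S                                       # graph Laplacian
K = X.T @ X
lam, k, eps = 0.5, 5, 1e-10
W, V = rng.random((30, k)), rng.random((30, k))

def J(W, V):
    # ||X - X W V^T||_F^2 + lam * Tr(V^T L V)
    R = X - X @ W @ V.T
    return np.sum(R * R) + lam * np.trace(V.T @ L @ V)

prev = J(W, V)
for _ in range(100):
    W = W * (K @ V) / (K @ W @ (V.T @ V) + eps)
    V = V * (K @ W + lam * (S @ V)) / (V @ (W.T @ K @ W) + lam * (D @ V) + eps)
    assert J(W, V) <= prev + 1e-6   # tolerance covers the eps damping
    prev = J(W, V)
```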
Ye, J., Jin, Z. Graph-Regularized Local Coordinate Concept Factorization for Image Representation. Neural Process Lett 46, 427–449 (2017). https://doi.org/10.1007/s11063-017-9598-2