
Robust automated graph regularized discriminative non-negative matrix factorization

Multimedia Tools and Applications

Abstract

Non-negative matrix factorization (NMF) and its variants have been widely employed in clustering and classification tasks. However, existing methods do not consider robustness, adaptive graph learning, and discriminative information at the same time. To address this problem, a new non-negative matrix factorization method is proposed, called robust automated graph regularized discriminative non-negative matrix factorization (RAGDNMF). Specifically, the L2,1 norm is used to measure the reconstruction error, an appropriate Laplacian graph is learned automatically, and the label information of the training set is incorporated as a regularization term. The ultimate goal is to learn a good projection matrix that removes redundant information while preserving the effective components. In addition, we derive multiplicative updating rules for solving the optimization problem and prove the convergence of the objective function. Face recognition experiments on four benchmark datasets demonstrate the effectiveness of the proposed method.



References

  1. Arora S, Ge R, Kannan R, Moitra A (2012) Computing a nonnegative matrix factorization-provably. In: ACM symposium on theory of computing, pp 145–162

  2. Belhumeur PN, Hespanha JP, Kriegman DJ (1997) Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell 19(7):711–720


  3. Belkin M, Niyogi P (2003) Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput 15(6):1373–1396


  4. Berry M, Browne M, Langville A, Pauca V, Plemmons R (2007) Algorithms and applications for approximate nonnegative matrix factorization. Comput Stat Data Anal 52(1):155–173


  5. Bi H, Li N, Guan H, Lu D, Yang L (2019) A multi-scale conditional generative adversarial network for face sketch synthesis. In: IEEE international conference on image processing. IEEE, pp 3876–3880

  6. Brunet JP, Tamayo P, Golub TR, Mesirov JP (2004) Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci 101(12):4164–4169


  7. Cai D, He X, Han J, Huang T (2011) Graph regularized nonnegative matrix factorization for data representation. IEEE Trans Pattern Anal Mach Intell 33(8):1548–1560


  8. Cho MA, Kim T, Kim IJ, Lee S (2021) Relational deep feature learning for heterogeneous face recognition. IEEE Trans Inf Forensic Secur 16:376–388


  9. Ding CHQ, Li T, Jordan MI (2009) Nonnegative matrix factorization for combinatorial optimization: spectral clustering, graph matching, and clique finding. In: IEEE international conference on data mining. IEEE, pp 183–192

  10. Fan DP, Zhang S, Wu Y, Cheng MM, Ren B, Ji R, Rosin P (2019) Scoot: a perceptual metric for facial sketches. In: IEEE international conference on computer vision, pp 1–11

  11. Feng G, Zhang W, Wang C, Luo Z (2017) GNMF revisited: joint robust k-NN graph and reconstruction-based graph regularization for image clustering. In: International conference on artificial neural networks. Springer, Cham, pp 442–449

  12. Graham DB, Allinson NM (1998) Characterising virtual eigensignatures for general purpose face recognition. In: Face recognition. Springer, Berlin, pp 446–456

  13. Guan N, Huang X, Long L, Luo Z, Xiang Z (2012) Graph based semi-supervised non-negative matrix factorization for document clustering. In: International conference on machine learning & applications. IEEE, pp 404–408

  14. Hao YJ, Gao YL, Hou MX, Dai LY, Liu JX (2019) Hypergraph regularized discriminative nonnegative matrix factorization on sample classification and co-differentially expressed gene selection. Complexity 2019:1–12


  15. Huang Y, Wang Y, Tai Y, Liu X, Shen P, Li S, Li J, Huang F (2020) CurricularFace: adaptive curriculum learning loss for deep face recognition. In: International conference on computer vision and pattern recognition, pp 1–10

  16. Huang S, Xu Z, Fei W (2017) Nonnegative matrix factorization with adaptive neighbors. In: International joint conference on neural networks. IEEE, pp 486–493

  17. Jia Y, Kwong S, Hou J, Wu W (2020) Semi-supervised nonnegative matrix factorization with dissimilarity and similarity regularization. IEEE Trans Neural Netw Learn Syst 31(7):2510–2521


  18. Jin H, Nie F, Huang H, Ding C (2014) Robust manifold nonnegative matrix factorization. ACM Trans Knowl Discov Data 8(3):1–21


  19. Kong D, Ding CHQ, Huang H (2011) Robust nonnegative matrix factorization using l21-norm. In: ACM international conference on information & knowledge management, pp 673–682

  20. Lee D, Seung H (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401(6755):788–791


  21. Lee D, Seung H (2001) Algorithms for non-negative matrix factorization. In: Advances in neural information processing systems, pp 556–562

  22. Lee H, Yoo J, Choi S (2010) Semi-supervised nonnegative matrix factorization. IEEE Sig Process Lett 17(1):4–7


  23. Li S, Hou X, Zhang H, Cheng Q (2001) Learning spatially localized, parts-based representation. In: IEEE conference on computer vision and pattern recognition. IEEE, pp 207–212

  24. Li CG, You C, Vidal R (2017) Structured sparse subspace clustering: a joint affinity learning and subspace clustering framework. IEEE Trans Image Process 26(6):2988–3001


  25. Ling X, Hao D, Wei J, Tang K (2017) Nonnegative matrix factorization by joint locality-constrained and l21-norm regularization. Multimed Tools Appl 77(7):1–20


  26. Liu JX, Wang D, Gao YL, Zheng CH, Yu J (2018) Regularized non-negative matrix factorization for identifying differential genes and clustering samples: a survey. IEEE/ACM Trans Comput Biol Bioinform 15(3):974–987


  27. Liu H, Wu Z, Cai D, Huang TS (2011) Constrained nonnegative matrix factorization for image representation. IEEE Trans Pattern Anal Mach Intell 34(7):1299–1311


  28. Logothetis N, Sheinberg D (1996) Visual object recognition. Ann Rev Neurosci 19(1):577–621


  29. Long X, Lu H, Peng Y, Li W (2014) Graph regularized discriminative non-negative matrix factorization for face recognition. Multimed Tools Appl 72(3):2679–2699


  30. Lu G, Wang Y, Zou J (2016) Low rank matrix factorization with adaptive graph regularizer. IEEE Trans Image Process 25(5):2196–2205


  31. Palmer S (1977) Hierarchical structure in perceptual representation. Cogn Psychol 9(4):441–474


  32. Peng C, Zhao K, Hu Y, Cheng J, Cheng Q (2017) Robust graph regularized nonnegative matrix factorization for clustering. ACM Trans Knowl Discov Data 11(3):1–30


  33. Petersen KB, Pedersen MS (2008) The matrix cookbook. http://matrixcookbook.com, pp 1–71

  34. Samaria FS, Harter AC (1994) Parameterisation of a stochastic model for human face identification. In: IEEE workshop on applications of computer vision. IEEE, pp 138–142

  35. Shen XY, Zhang X, Lan L, Liao Q, Luo ZG (2019) Another robust NMF: rethinking the hyperbolic tangent function and locality constraint. IEEE Access 7:31089–31102


  36. Yan S, Xu D, Zhang B, Zhang HJ, Yang Q, Lin S (2007) Graph embedding and extensions: a general framework for dimensionality reduction. IEEE Trans Pattern Anal Mach Intell 29(1):40–51


  37. Sim T, Baker S, Bsat M (2003) The CMU pose, illumination, and expression database. IEEE Trans Pattern Anal Mach Intell 25(12):1615–1618


  38. Teng Y, Qi S, Yin D, Xu L, Wei Q, Yan K (2017) Semi-supervised nonnegative matrix factorization with commonness extraction. Neural Process Lett 45(3):1063–1076


  39. Vavasis SA (2010) On the complexity of nonnegative matrix factorization. SIAM J Optim 20(3):1364–1377


  40. Wachsmuth E, Oram M, Perrett D (1994) Recognition of objects and their component parts: responses of single units in the temporal cortex of the macaque. Cereb Cortex 4(5):509–522


  41. Wang YX, Zhang YJ (2013) Nonnegative matrix factorization: a comprehensive review. IEEE Trans Knowl Data Eng 25(6):1336–1353


  42. Wen J, Zheng T, Liu X, Wei L (2013) Neighborhood preserving orthogonal PNMF feature extraction for hyperspectral image classification. IEEE J Sel Top Appl Earth Obs Remote Sens 6(2):759–768


  43. Wu B, Wang E, Zhen Z, Wei C, Xiao P (2018) Manifold NMF with l21 norm for clustering. Neurocomputing 273:78–88


  44. Xu W, Liu X, Gong Y (2003) Document clustering based on non-negative matrix factorization. In: ACM SIGIR conference on research and development in information retrieval, pp 267–273

  45. Yan J, Li C, Li Y, Cao G (2018) Adaptive discrete hypergraph matching. IEEE Trans Cybern 48(2):765–779


  46. Yang S, Hou C, Zhang C, Wu Y, Weng S (2013) Robust non-negative matrix factorization via joint sparse and graph regularization. In: International joint conference on neural networks. IEEE, pp 1–5

  47. Yi Y, Wang J, Zhou W, Zheng C, Qiao S (2020) Non-negative matrix factorization with locality constrained adaptive graph. IEEE Trans Circ Syst Video Technol 30(2):427–441


  48. Zhang Q, Li B (2010) Discriminative K-SVD for dictionary learning in face recognition. In: IEEE conference on computer vision and pattern recognition. IEEE, pp 2691–2698


Acknowledgements

This work is supported in part by the National Natural Science Foundation of China (Grants No. 61906098, No. 61701258, No. 61872190, No. 61906099 and No. 61972210) and by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (Grant No. 18KJB520034).

Author information


Corresponding author

Correspondence to Xianzhong Long.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Proof of Theorem

For the matrix calculus identities used throughout the proof, we refer the reader to the reference book [33]. Since many such identities are involved, we do not derive them in detail here.

In order to prove the Theorem, we need to show that O1 is non-increasing under the updating steps in (25), (26), (27) and (28). When updating W, we fix H, C and A, so that only the first term of O1 is involved. Similarly, when updating C, we fix W, H and A, and only the second term of O1 is involved; when updating A, we fix W, C and H, and only the fourth term of O1 is involved. Therefore, the update formulas for W, C and A in RAGDNMF have exactly the same form as in the original NMF, and the convergence proof of NMF can be reused to show that O1 is non-increasing under the update steps in (25), (27) and (28). The details can be found in [21].
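For reference, the classical multiplicative rules of Lee and Seung [21] for minimizing ||X - WH||_F^2 have the following form; the updates for W, C and A in RAGDNMF inherit this structure once the remaining factors are fixed. The sketch below is our own minimal NumPy illustration of [21], not the paper's exact rules (25), (27) and (28):

    import numpy as np

    def nmf_multiplicative(X, r, n_iter=500, eps=1e-10, seed=0):
        # Classical Lee-Seung updates [21] for min ||X - W H||_F^2
        # subject to W >= 0, H >= 0; eps guards against division by zero.
        rng = np.random.default_rng(seed)
        m, n = X.shape
        W = rng.random((m, r))
        H = rng.random((r, n))
        for _ in range(n_iter):
            H *= (W.T @ X) / np.maximum(W.T @ W @ H, eps)  # fix W, update H
            W *= (X @ H.T) / np.maximum(W @ H @ H.T, eps)  # fix H, update W
        return W, H

Each factor is updated with the other held fixed, which is precisely why the convergence argument can be applied one update rule at a time.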

Hence, we only need to prove that O1 is non-increasing under the updating step in (26). We follow a process similar to that in [21]. Our proof makes use of an auxiliary function, which is defined as follows.

Definition 1

\(G\left (h, h^{\prime }\right )\) is an auxiliary function of F(h) if the following conditions are satisfied.

$$ G(h,h^{\prime})\geq F(h),~~~~ G(h,h)=F(h) $$
(29)

This auxiliary function is important because of the following lemma.

Lemma 1

If G is an auxiliary function of F, then F is non-increasing under the update

$$ h^{(t+1)}=\underset{h}{\arg\min} G\left( h, h^{(t)}\right) $$
(30)

Proof

\(F\left(h^{(t+1)}\right)\leq G\left(h^{(t+1)}, h^{(t)}\right)\leq G\left(h^{(t)}, h^{(t)}\right)=F\left(h^{(t)}\right)\), where the first inequality holds because \(G(h,h^{\prime})\geq F(h)\), the second because \(h^{(t+1)}\) minimizes \(G\left(h, h^{(t)}\right)\) in (30), and the final equality because \(G(h,h)=F(h)\). □
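For intuition, consider a one-dimensional instance of this majorize-minimize argument (our illustration, not part of the original proof). Take \(F(h)=(h-2)^{2}\) and majorize it by a quadratic with larger curvature:

$$ G(h,h^{\prime})=F(h^{\prime})+F^{\prime}(h^{\prime})(h-h^{\prime})+2(h-h^{\prime})^{2}=F(h)+(h-h^{\prime})^{2} $$

so that \(G(h,h^{\prime})\geq F(h)\) and \(G(h,h)=F(h)\). Minimizing \(G\) over \(h\) gives \(h^{(t+1)}=(h^{(t)}+2)/2\), which halves the distance to the minimizer \(h=2\) at every step, so \(F\) is non-increasing, exactly as Lemma 1 guarantees. The auxiliary function (33) below plays the same role for Fab: it agrees with the Taylor expansion (34) except for a larger quadratic coefficient.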

Now, we show that the update step for H in (26) is exactly the update in (30) with a proper auxiliary function.

Consider any element hab in H, and let Fab denote the part of O1 that is relevant only to hab. It is easy to obtain the following derivatives.

$$ F_{ab}^{\prime}=\left( \frac{\partial O_{1}}{\partial\mathbf{H}}\right)_{ab}=\left[2\mathbf{W}^{T}(\mathbf{W}\mathbf{H}\mathbf{D}-\mathbf{X}\mathbf{D})+2\lambda\mathbf{H}(\mathbf{Q}-\mathbf{P})+2\gamma\mathbf{A}^{T}(\mathbf{A}\mathbf{H}\mathbf{F}-\mathbf{S}\mathbf{F}) \right]_{ab} $$
(31)
$$ F_{ab}^{\prime\prime}=2\left( \mathbf{W}^{T}\mathbf{W}\mathbf{D}\right)_{aa}+2\lambda\mathbf{Q}_{bb}-2\lambda\mathbf{P}_{bb}+2\gamma\left( \mathbf{A}^{T}\mathbf{A}\mathbf{F}\right)_{aa} $$
(32)

Because our update is essentially element-wise, it suffices to show that each Fab is non-increasing under the update step of (26). To this end, we introduce the following lemma.

Lemma 2

Function

$$ \begin{array}{@{}rcl@{}} G\left( h,h_{ab}^{(t)}\right)&=&F_{ab}\left( h_{ab}^{(t)}\right)+F_{ab}^{\prime}\left( h_{ab}^{(t)}\right)\left( h-h_{ab}^{(t)}\right)\\ &&+\frac{\left( \mathbf{W}^{T}\mathbf{W}\mathbf{H}\mathbf{D}+\gamma\mathbf{A}^{T}\mathbf{A}\mathbf{H}\mathbf{F}+\lambda\mathbf{H}\mathbf{Q}\right)_{ab}}{h_{ab}^{(t)}}\left( h-h_{ab}^{(t)}\right)^{2} \end{array} $$
(33)

is an auxiliary function of Fab.

Proof

We only need to prove that \(G\left (h,h_{ab}^{(t)}\right )\geq F_{ab}(h)\), since G(h,h) = Fab(h) is obvious. To this end, we first consider the Taylor series expansion of Fab(h).

$$ \begin{array}{@{}rcl@{}} F_{ab}(h)&=&F_{ab}\left( h_{ab}^{(t)}\right)+F_{ab}^{\prime}\left( h_{ab}^{(t)}\right)\left( h-h_{ab}^{(t)}\right)+\left[\left( \mathbf{W}^{T}\mathbf{W}\mathbf{D}\right)_{aa}\right.\\ &&\left.+\lambda\mathbf{Q}_{bb}-\lambda\mathbf{P}_{bb}+\gamma\left( \mathbf{A}^{T}\mathbf{A}\mathbf{F}\right)_{aa}\right]\left( h-h_{ab}^{(t)}\right)^{2} \end{array} $$
(34)

We compare (34) with (33) and find that \(G\left (h,h_{ab}^{(t)}\right )\geq F_{ab}(h)\) is equivalent to

$$ \begin{array}{ll} \frac{\left( \mathbf{W}^{T}\mathbf{W}\mathbf{H}\mathbf{D}+\gamma\mathbf{A}^{T}\mathbf{A}\mathbf{H}\mathbf{F}+\lambda\mathbf{H}\mathbf{Q}\right)_{ab}}{h_{ab}^{(t)}}\geq \left( \mathbf{W}^{T}\mathbf{W}\mathbf{D}\right)_{aa}+\lambda\mathbf{Q}_{bb}-\lambda\mathbf{P}_{bb}+\gamma\left( \mathbf{A}^{T}\mathbf{A}\mathbf{F}\right)_{aa} \end{array} $$
(35)

In fact, we have

$$ \begin{array}{@{}rcl@{}} \left( \mathbf{W}^{T}\mathbf{W}\mathbf{H}\mathbf{D}+\gamma\mathbf{A}^{T}\mathbf{A}\mathbf{H}\mathbf{F}\right)_{ab}&=&\sum\limits_{q=1}^{r}\left( \mathbf{W}^{T}\mathbf{W}\right)_{aq}h_{qb}^{(t)}\mathbf{D}_{aq} \\ &&+\gamma\sum\limits_{q=1}^{r}\left( \mathbf{A}^{T}\mathbf{A}\right)_{aq}h_{qb}^{(t)}\mathbf{F}_{aq}\geq \left( \mathbf{W}^{T}\mathbf{W}\right)_{aa}h_{ab}^{(t)}\mathbf{D}_{aa} \\ &&+\gamma\left( \mathbf{A}^{T}\mathbf{A}\right)_{aa}h_{ab}^{(t)}\mathbf{F}_{aa} \end{array} $$
(36)

and

$$ \begin{array}{ll} (\lambda\mathbf{H}\mathbf{Q})_{ab}=\lambda\sum\limits_{j=1}^{n}h_{aj}^{(t)}\mathbf{Q}_{jb}\geq \lambda h_{ab}^{(t)}\mathbf{Q}_{bb}\geq\lambda h_{ab}^{(t)}\mathbf{Q}_{bb}-\lambda h_{ab}^{(t)}\mathbf{P}_{bb} \end{array} $$
(37)

Both (36) and (37) follow from the non-negativity of all the summands: keeping only the q = a term in (36) and the j = b term in (37) can only decrease the sums, and subtracting \(\lambda h_{ab}^{(t)}\mathbf{P}_{bb}\geq 0\) decreases the right-hand side of (37) further. Thus, (35) holds and \(G\left (h,h_{ab}^{(t)}\right )\geq F_{ab}(h)\). We can now prove the Theorem. □

Proof of Theorem

Replacing \(G\left (h,h_{ab}^{(t)}\right )\) in (30) by (33) results in the following update rule:

$$ \begin{array}{@{}rcl@{}} h_{ab}^{(t+1)}&=&h_{ab}^{(t)}-h_{ab}^{(t)}\frac{F_{ab}^{\prime}\left( h_{ab}^{(t)}\right)}{2\left( \mathbf{W}^{T}\mathbf{W}\mathbf{H}\mathbf{D}+\gamma\mathbf{A}^{T}\mathbf{A}\mathbf{H}\mathbf{F}+\lambda\mathbf{H}\mathbf{Q}\right)_{ab}}\\ &=&h_{ab}^{(t)}\frac{\left( \gamma\mathbf{A}^{T}\mathbf{S}\mathbf{F}+\mathbf{W}^{T}\mathbf{X}\mathbf{D}+\lambda\mathbf{H}\mathbf{P}\right)_{ab}}{\left( \mathbf{W}^{T}\mathbf{W}\mathbf{H}\mathbf{D}+\gamma\mathbf{A}^{T}\mathbf{A}\mathbf{H}\mathbf{F}+\lambda\mathbf{H}\mathbf{Q}\right)_{ab}} \end{array} $$
(38)

Since (33) is an auxiliary function of Fab, Fab is non-increasing under this update rule, which completes the proof. □
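As a concrete illustration, rule (38) can be implemented as a single vectorized step. The following is a minimal NumPy sketch of ours; the matrix shapes are assumed from the context (X of size m × n, W of size m × r, H of size r × n, A of size c × r, S of size c × n, and D, F, P, Q of size n × n), and a small constant guards against division by zero:

    import numpy as np

    def update_H(X, W, H, A, S, D, F, P, Q, lam, gamma, eps=1e-10):
        # One multiplicative step for H following Eq. (38);
        # lam and gamma are the regularization weights.
        numer = gamma * (A.T @ S @ F) + W.T @ X @ D + lam * (H @ P)
        denom = W.T @ W @ H @ D + gamma * (A.T @ A @ H @ F) + lam * (H @ Q)
        return H * numer / np.maximum(denom, eps)

Because both numerator and denominator are element-wise non-negative, the step preserves the non-negativity of H, and by the auxiliary-function argument above each step leaves O1 non-increasing.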


Cite this article

Long, X., Xiong, J. & Chen, L. Robust automated graph regularized discriminative non-negative matrix factorization. Multimed Tools Appl 80, 14867–14886 (2021). https://doi.org/10.1007/s11042-020-10410-w
