
Non-negative enhanced discriminant matrix factorization method with sparsity regularization

  • Original Article
  • Published in: Neural Computing and Applications

Abstract

Efficient low-rank representation of data plays a significant role in computer vision and pattern recognition. To obtain a more discriminative and sparse low-dimensional representation, a novel non-negative enhanced discriminant matrix factorization method with sparsity regularization is proposed in this paper. First, the local invariance and discriminant information of the low-dimensional representation are incorporated into the objective function to construct a new within-class encouragement constraint term, and weighting coefficients are introduced to further enhance the compactness of samples belonging to the same class in the new basis space. Second, a new between-class penalty term is constructed to maximize the difference between samples of different classes, and weighting coefficients are again introduced to further enhance the separation and discriminability between classes. Finally, to better learn a parts-based representation of the data, a sparsity constraint term is introduced, so that sparseness of the representation, local invariance, and discriminability are integrated into a unified framework. Moreover, the optimization procedure and the convergence proof of the objective function are given. Extensive experiments demonstrate the strong robustness of the proposed method in face recognition and image classification under occlusion.
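To make the structure of such an objective concrete, the following minimal Python sketch shows the kind of cost function the abstract describes: a divergence-based reconstruction term, plus a weighted within-class compactness term, minus a weighted between-class separation term, plus an L1 sparsity penalty. All function and parameter names are illustrative assumptions, and the paper's per-class weights (rat_i, W'_{i,j}) are replaced by uniform weights for brevity; this is not the paper's exact Eq. (14).

```python
import numpy as np

def nedmf_sr_objective(B, Z, H, labels, gamma, delta, lam, eps=1e-12):
    """Illustrative sketch: KL reconstruction + within-class compactness
    - between-class separation + L1 sparsity. Uniform weights stand in
    for the paper's per-class weighting coefficients."""
    ZH = Z @ H
    # KL-divergence reconstruction term, as in divergence-based NMF
    recon = np.sum(B * np.log((B + eps) / (ZH + eps)) - B + ZH)
    classes = np.unique(labels)
    means = np.stack([H[:, labels == c].mean(axis=1) for c in classes])
    within = 0.0
    for c in classes:
        Hc = H[:, labels == c]
        n = Hc.shape[1]
        if n > 1:
            diff = Hc[:, :, None] - Hc[:, None, :]       # all column pairs
            within += np.sum(diff ** 2) / (n * (n - 1))  # sum over j != m
    C = len(classes)
    d = means[:, None, :] - means[None, :, :]            # class-mean pairs
    between = np.sum(d ** 2) / (C * (C - 1))             # sum over i != j
    return recon + gamma * within - delta * between + lam * np.sum(np.abs(H))
```

Here `B` is the non-negative data matrix, `Z` the basis, `H` the coefficient matrix, and `labels` a NumPy array assigning a class index to each column of `H`.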





Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (Grant No. 61072110), the Science and Technology Overall Innovation Project of Shaanxi Province (Grant 2013KTZB03-03-03), the Industrial Research Project of Shaanxi Province (Grant BD12015020001), and the International Cooperation Project of Shaanxi Province (Grant BD18015050001).

Author information


Corresponding author

Correspondence to Ming Tong.

Ethics declarations

Conflict of interest

The authors declare that they have no potential conflicts of interest.

Human and animal rights

The authors declare that this work involved no research with human participants or animals.

Informed consent

The authors declare that this work contains no material requiring informed consent.

Appendix: Proof of Theorem 1

To prove Theorem 1, it is necessary to show that the objective function in Eq. (14) is nonincreasing under the iteration rules in Eqs. (20) and (21). Since the iteration rule in Eq. (20) is identical to that of the original NMF, the convergence proof of NMF can be adopted to show that the objective function is nonincreasing under Eq. (20); it therefore remains only to prove that the objective function is nonincreasing under Eq. (21). The proof uses an auxiliary function similar to the one employed in the Expectation Maximization (EM) algorithm.

Definition 1

If the conditions \( G(M,M^{(t)}) \ge F(M) \) and \( G(M,M) = F(M) \) hold, then \( G(M,M^{(t)}) \) is an auxiliary function of \( F(M) \).

Lemma 1

If \( G(M,M^{(t)}) \) is an auxiliary function of \( F(M) \), then \( F(M) \) is nonincreasing under the following update rule:

$$ M^{(t + 1)} = \arg \min_{M} G(M,M^{(t)} ) $$
(26)

Proof

\( F(M^{(t + 1)} ) \le G(M^{(t + 1)} ,M^{(t)} ) \le G(M^{(t)} ,M^{(t)} ) = F(M^{(t)} ) \).
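Lemma 1 is the standard majorize-minimize argument. The following toy Python check illustrates it, assuming a hypothetical objective F (not the paper's \( D_{NEDMF\_SR} \)) and a quadratic surrogate G whose curvature constant L upper-bounds F'' so that G majorizes F:

```python
import numpy as np

def F(m):
    # Hypothetical smooth objective for illustration only.
    return (m - 3.0) ** 2 + np.log(1.0 + m ** 2)

def dF(m):
    return 2.0 * (m - 3.0) + 2.0 * m / (1.0 + m ** 2)

L = 10.0  # here F''(m) <= 4 everywhere, so L = 10 guarantees majorization

def G(m, m_t):
    # Auxiliary function: G(m, m_t) >= F(m) and G(m_t, m_t) = F(m_t).
    return F(m_t) + dF(m_t) * (m - m_t) + 0.5 * L * (m - m_t) ** 2

m = 0.0
for _ in range(20):
    m_next = m - dF(m) / L            # closed-form argmin of G(., m)
    assert F(m_next) <= F(m) + 1e-12  # Lemma 1: F never increases
    m = m_next
```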

Lemma 2

The function \( G({\mathbf{H}}(k),{\mathbf{H}}^{(t)}(k)) \), an auxiliary function of \( F({\mathbf{H}}(k)) \), and \( F({\mathbf{H}}(k)) \) are given by Eqs. (27) and (28), respectively:

$$ \begin{aligned} G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k)) & = \sum\limits_{i,j} {\left( {B_{i,j} (k)\log B_{i,j} (k) - B_{i,j} (k)} \right)} \\ & \quad - \sum\limits_{i,j} {B_{i,j} (k)\sum\limits_{m} {\frac{{Z_{i,m} (k)H_{m,j}^{(t)} (k)}}{{\sum\limits_{m} {Z_{i,m} (k)H_{m,j}^{(t)} (k)} }}\left( {\log \left( {Z_{i,m} (k)H_{m,j} (k)} \right) - \log \frac{{Z_{i,m} (k)H_{m,j}^{(t)} (k)}}{{\sum\limits_{m} {Z_{i,m} (k)H_{m,j}^{(t)} (k)} }}} \right)} } \\ & \quad + \sum\limits_{i,j,m} {Z_{i,m} (k)H_{m,j} (k)} + \gamma \sum\limits_{i = 1}^{C} {\sum\limits_{j \ne m}^{{N_{i} }} {\frac{{rat_{i} }}{{N_{i} (N_{i} - 1)}}\left\| {{\varvec{\upeta}}_{j}^{(i)} (k) - {\varvec{\upeta}}_{m}^{(i)} (k)} \right\|_{2}^{2} } } + \lambda \sum\limits_{i,j} {H_{i,j} (k)} \\ & \quad - \frac{\delta }{{C(C - 1)}}\sum\limits_{i \ne j}^{C} {W^{\prime}_{i,j} (k)\left\| {{\varvec{\upmu}}^{(i)} (k) - {\varvec{\upmu}}^{(j)} (k)} \right\|_{2}^{2} } \\ \end{aligned} $$
(27)
$$ \begin{aligned} F\left( {{\mathbf{H}}(k)} \right) = D_{NEDMF\_SR} & = \sum\limits_{i,j} {\left( {B_{i,j} (k)\log \frac{{B_{i,j} (k)}}{{\left( {{\mathbf{Z}}(k){\mathbf{H}}(k)} \right)_{i,j} }} - B_{i,j} (k) + \left( {{\mathbf{Z}}(k){\mathbf{H}}(k)} \right)_{i,j} } \right)} \\ & \quad + \gamma \sum\limits_{i = 1}^{C} {\sum\limits_{j \ne m}^{{N_{i} }} {\frac{{rat_{i} }}{{N_{i} (N_{i} - 1)}}\left\| {{\varvec{\upeta}}_{j}^{(i)} (k) - {\varvec{\upeta}}_{m}^{(i)} (k)} \right\|_{2}^{2} } } \\ & \quad - \frac{\delta }{{C(C - 1)}}\sum\limits_{i \ne j}^{C} {W^{\prime}_{i,j} (k)\left\| {{\varvec{\upmu}}^{(i)} (k) - {\varvec{\upmu}}^{(j)} (k)} \right\|_{2}^{2} } + \lambda \sum\limits_{i,j} {H_{i,j} (k)} \\ \end{aligned} $$
(28)

Proof

It is easy to verify that \( G({\mathbf{H}}(k),{\mathbf{H}}(k)) = F({\mathbf{H}}(k)) \). According to Definition 1, it therefore suffices to show \( G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k)) \ge F({\mathbf{H}}(k)) \) to prove that \( G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k)) \) is an auxiliary function of \( F({\mathbf{H}}(k)) \).

Owing to the convexity of \( -\log (\cdot) \), the following inequality (Jensen's inequality) holds for any non-negative elements \( a_{m} \) satisfying \( \sum\nolimits_{m} {a_{m} } = 1 \):

$$ - \log \left( {\sum\limits_{m} {Z_{i,m} (k)H_{m,j} (k)} } \right) \le - \sum\limits_{m} {a_{m} \log \frac{{Z_{i,m} (k)H_{m,j} (k)}}{{a_{m} }}} $$
(29)

Setting \( a_{m} = \frac{{Z_{i,m} (k)H_{m,j}^{(t)} (k)}}{{\sum\nolimits_{m} {Z_{i,m} (k)H_{m,j}^{(t)} (k)} }} \), Eq. (29) can be rewritten as follows:

$$ - \log \left( {\sum\limits_{m} {Z_{i,m} (k)H_{m,j} (k)} } \right) \le - \sum\limits_{m} {\frac{{Z_{i,m} (k)H_{m,j}^{(t)} (k)}}{{\sum\nolimits_{m} {Z_{i,m} (k)H_{m,j}^{(t)} (k)} }}\left( {\log \left( {Z_{i,m} (k)H_{m,j} (k)} \right) - \log \frac{{Z_{i,m} (k)H_{m,j}^{(t)} (k)}}{{\sum\nolimits_{m} {Z_{i,m} (k)H_{m,j}^{(t)} (k)} }}} \right)} $$
(30)

From Eq. (30), it is easily observed that \( G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k)) \ge F({\mathbf{H}}(k)) \). Consequently, \( G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k)) \) can be viewed as an auxiliary function of \( F({\mathbf{H}}(k)) \).
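Equation (29) is Jensen's inequality applied to the convex function \( -\log (\cdot) \). A quick numeric spot-check in Python for one fixed index pair \( (i,j) \), with random non-negative data (a sketch, not part of the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.random(5) + 0.1    # Z_{i,m}(k) for a fixed (i, k): positive entries
h = rng.random(5) + 0.1    # H_{m,j}(k) for a fixed (j, k)
h_t = rng.random(5) + 0.1  # previous iterate H^{(t)}_{m,j}(k)

a = z * h_t / np.sum(z * h_t)         # weights a_m: non-negative, sum to 1
lhs = -np.log(np.sum(z * h))          # left side of Eq. (29)
rhs = -np.sum(a * np.log(z * h / a))  # right side of Eq. (29)
assert lhs <= rhs + 1e-12             # Jensen: -log is convex
```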

Proof of Theorem 1

The minimum of \( G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k)) \) with respect to \( H_{m,l} (k) \) is obtained by setting the gradient to zero:

$$ \begin{aligned} \frac{{\partial G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k))}}{{\partial H_{m,l} (k)}} & = - \sum\limits_{i} {B_{i,l} (k)\frac{{Z_{i,m} (k)H_{m,l}^{(t)} (k)}}{{\sum\nolimits_{n} {Z_{i,n} (k)H_{n,l}^{(t)} (k)} }}} \frac{1}{{H_{m,l} (k)}} + \sum\limits_{i} {Z_{i,m} (k)} \\ & \quad + \frac{{4rat_{r} \gamma }}{{N_{r} - 1}}\left( {H_{m,l} (k) - \mu_{m}^{(r)} (k)} \right) - \frac{{4\delta }}{{N_{r} C\left( {C - 1} \right)}}\sum\limits_{i \ne r}^{C} {W^{\prime}_{i,r} (k)\left( {\mu_{m}^{(r)} (k) - \mu_{m}^{(i)} (k)} \right)} + \lambda \\ & = 0 \\ \end{aligned} $$
(31)

The above equation is quadratic in \( H_{m,l} (k) \), and solving it yields the iterative rule in Eq. (21). Since \( G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k)) \) is an auxiliary function, Lemma 1 implies that \( F({\mathbf{H}}(k)) \), i.e., \( D_{NEDMF\_SR} \), is nonincreasing under the iterative rule in Eq. (21).
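For intuition, when the weights \( \gamma \), \( \delta \), and \( \lambda \) are all zero, the stationarity condition in Eq. (31) should collapse to the classical multiplicative update of divergence-based NMF (Lee and Seung). A minimal Python sketch of that reduced update is given below; it is an illustration under that simplifying assumption, not the paper's full rule in Eq. (21):

```python
import numpy as np

def kl_nmf_update_H(B, Z, H, eps=1e-12):
    """One multiplicative H-update for plain divergence-based NMF.
    The paper's full Eq. (21) additionally carries the discriminant and
    sparsity terms from Eq. (31) and solves a quadratic instead."""
    ZH = Z @ H
    numer = Z.T @ (B / (ZH + eps))        # Z^T (B / (ZH)), elementwise ratio
    denom = Z.sum(axis=0)[:, None] + eps  # column sums of Z
    return H * numer / denom

# Usage sketch: alternate with the matching Z-update until the divergence stalls.
rng = np.random.default_rng(1)
B = rng.random((30, 20))                        # non-negative data
Z = rng.random((30, 5)); H = rng.random((5, 20))
for _ in range(50):
    H = kl_nmf_update_H(B, Z, H)
```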


About this article


Cite this article

Tong, M., Bu, H., Zhao, M. et al. Non-negative enhanced discriminant matrix factorization method with sparsity regularization. Neural Comput & Applic 31, 3117–3140 (2019). https://doi.org/10.1007/s00521-017-3258-3

