
Non-negative enhanced discriminant matrix factorization method with sparsity regularization

  • Original Article
  • Published in: Neural Computing and Applications

Abstract

Efficient low-rank representation of data plays a significant role in computer vision and pattern recognition. To obtain a more discriminative and sparse low-dimensional representation, a novel non-negative enhanced discriminant matrix factorization method with sparsity regularization is proposed in this paper. First, the local invariance and discriminant information of the low-dimensional representation are incorporated into the objective function to construct a new within-class encouragement constraint term, and weighting coefficients are introduced to further enhance the compactness of samples belonging to the same class in the new basis space. Second, a new between-class penalty term is constructed to maximize the difference between samples of different classes, and weighting coefficients are again introduced to further enhance the separation and discriminability between classes. Finally, to better learn a parts-based representation of the data, a sparsity constraint term is introduced, so that sparseness of the representation, local invariance, and discriminability are integrated into a unified framework. Moreover, the optimization procedure and the convergence proof of the objective function are given. Extensive experiments demonstrate the strong robustness of the proposed method in face recognition and image classification under occlusion.
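To make the structure of such an objective concrete, the following minimal Python sketch shows the kind of cost function the abstract describes: a divergence-based reconstruction term, plus a weighted within-class compactness term, minus a weighted between-class separation term, plus an L1 sparsity penalty. All function and parameter names are illustrative assumptions, and the paper's per-class weights (rat_i, W'_{i,j}) are replaced by uniform weights for brevity; this is not the paper's exact Eq. (14).

```python
import numpy as np

def nedmf_sr_objective(B, Z, H, labels, gamma, delta, lam, eps=1e-12):
    """Illustrative sketch: KL reconstruction + within-class compactness
    - between-class separation + L1 sparsity. Uniform weights stand in
    for the paper's per-class weighting coefficients."""
    ZH = Z @ H
    # KL-divergence reconstruction term, as in divergence-based NMF
    recon = np.sum(B * np.log((B + eps) / (ZH + eps)) - B + ZH)
    classes = np.unique(labels)
    means = np.stack([H[:, labels == c].mean(axis=1) for c in classes])
    within = 0.0
    for c in classes:
        Hc = H[:, labels == c]
        n = Hc.shape[1]
        if n > 1:
            diff = Hc[:, :, None] - Hc[:, None, :]       # all column pairs
            within += np.sum(diff ** 2) / (n * (n - 1))  # sum over j != m
    C = len(classes)
    d = means[:, None, :] - means[None, :, :]            # class-mean pairs
    between = np.sum(d ** 2) / (C * (C - 1))             # sum over i != j
    return recon + gamma * within - delta * between + lam * np.sum(np.abs(H))
```

Here `B` is the non-negative data matrix, `Z` the basis, `H` the coefficient matrix, and `labels` a NumPy array assigning a class index to each column of `H`.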





Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (Grant No. 61072110), the Science and Technology Overall Innovation Project of Shaanxi Province (Grant 2013KTZB03-03-03), the Industrial Research Project of Shaanxi Province (Grant BD12015020001), and the International Cooperation Project of Shaanxi Province (Grant BD18015050001).

Author information


Corresponding author

Correspondence to Ming Tong.

Ethics declarations

Conflict of interest

The authors declare that they have no potential conflicts of interest.

Human and animal rights

The authors declare that this work involved no research with human participants or animals.

Informed consent

The authors declare that this work contains no material requiring informed consent.

Appendix: Proof of Theorem 1

To prove Theorem 1, it is necessary to show that the objective function in Eq. (14) is nonincreasing under the iteration rules in Eqs. (20) and (21). Since the iteration rule in Eq. (20) is identical to that of the original NMF, the convergence proof of NMF can be adopted to show that the objective function is nonincreasing under Eq. (20); it therefore remains only to prove that the objective function is nonincreasing under Eq. (21). The proof uses an auxiliary function similar to the one employed in the Expectation Maximization (EM) algorithm.

Definition 1

If the conditions \( G(M,M^{(t)}) \ge F(M) \) and \( G(M,M) = F(M) \) hold, then \( G(M,M^{(t)}) \) is an auxiliary function of \( F(M) \).

Lemma 1

If \( G(M,M^{(t)}) \) is an auxiliary function of \( F(M) \), then \( F(M) \) is nonincreasing under the following update rule:

$$ M^{(t + 1)} = \arg \min_{M} G(M,M^{(t)} ) $$
(26)

Proof

\( F(M^{(t + 1)} ) \le G(M^{(t + 1)} ,M^{(t)} ) \le G(M^{(t)} ,M^{(t)} ) = F(M^{(t)} ) \).
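Lemma 1 is the standard majorize-minimize argument. The following toy Python check illustrates it, assuming a hypothetical objective F (not the paper's \( D_{NEDMF\_SR} \)) and a quadratic surrogate G whose curvature constant L upper-bounds F'' so that G majorizes F:

```python
import numpy as np

def F(m):
    # Hypothetical smooth objective for illustration only.
    return (m - 3.0) ** 2 + np.log(1.0 + m ** 2)

def dF(m):
    return 2.0 * (m - 3.0) + 2.0 * m / (1.0 + m ** 2)

L = 10.0  # here F''(m) <= 4 everywhere, so L = 10 guarantees majorization

def G(m, m_t):
    # Auxiliary function: G(m, m_t) >= F(m) and G(m_t, m_t) = F(m_t).
    return F(m_t) + dF(m_t) * (m - m_t) + 0.5 * L * (m - m_t) ** 2

m = 0.0
for _ in range(20):
    m_next = m - dF(m) / L            # closed-form argmin of G(., m)
    assert F(m_next) <= F(m) + 1e-12  # Lemma 1: F never increases
    m = m_next
```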

Lemma 2

The function \( G({\mathbf{H}}(k),{\mathbf{H}}^{(t)}(k)) \), an auxiliary function of \( F({\mathbf{H}}(k)) \), and \( F({\mathbf{H}}(k)) \) are given by Eqs. (27) and (28), respectively:

$$ \begin{aligned} G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k)) & = \sum\limits_{i,j} {\left( {B_{i,j} (k)\log B_{i,j} (k) - B_{i,j} (k)} \right)} \\ & \quad - \sum\limits_{i,j} {B_{i,j} (k)\sum\limits_{m} {\frac{{Z_{i,m} (k)H_{m,j}^{(t)} (k)}}{{\sum\limits_{m} {Z_{i,m} (k)H_{m,j}^{(t)} (k)} }}\left( {\log \left( {Z_{i,m} (k)H_{m,j} (k)} \right) - \log \frac{{Z_{i,m} (k)H_{m,j}^{(t)} (k)}}{{\sum\limits_{m} {Z_{i,m} (k)H_{m,j}^{(t)} (k)} }}} \right)} } \\ & \quad + \sum\limits_{i,j,m} {Z_{i,m} (k)H_{m,j} (k)} + \gamma \sum\limits_{i = 1}^{C} {\sum\limits_{j \ne m}^{{N_{i} }} {\frac{{rat_{i} }}{{N_{i} (N_{i} - 1)}}\left\| {{\varvec{\upeta}}_{j}^{(i)} (k) - {\varvec{\upeta}}_{m}^{(i)} (k)} \right\|_{2}^{2} } } + \lambda \sum\limits_{i,j} {H_{i,j} (k)} \\ & \quad - \frac{\delta }{{C(C - 1)}}\sum\limits_{i \ne j}^{C} {W^{\prime}_{i,j} (k)\left\| {{\varvec{\upmu}}^{(i)} (k) - {\varvec{\upmu}}^{(j)} (k)} \right\|_{2}^{2} } \\ \end{aligned} $$
(27)
$$ \begin{aligned} F\left( {{\mathbf{H}}(k)} \right) = D_{NEDMF\_SR} & = \sum\limits_{i,j} {\left( {B_{i,j} (k)\log \frac{{B_{i,j} (k)}}{{\left( {{\mathbf{Z}}(k){\mathbf{H}}(k)} \right)_{i,j} }} - B_{i,j} (k) + \left( {{\mathbf{Z}}(k){\mathbf{H}}(k)} \right)_{i,j} } \right)} \\ & \quad + \gamma \sum\limits_{i = 1}^{C} {\sum\limits_{j \ne m}^{{N_{i} }} {\frac{{rat_{i} }}{{N_{i} (N_{i} - 1)}}\left\| {{\varvec{\upeta}}_{j}^{(i)} (k) - {\varvec{\upeta}}_{m}^{(i)} (k)} \right\|_{2}^{2} } } \\ & \quad - \frac{\delta }{{C(C - 1)}}\sum\limits_{i \ne j}^{C} {W^{\prime}_{i,j} (k)\left\| {{\varvec{\upmu}}^{(i)} (k) - {\varvec{\upmu}}^{(j)} (k)} \right\|_{2}^{2} } + \lambda \sum\limits_{i,j} {H_{i,j} (k)} \\ \end{aligned} $$
(28)

Proof

It is easy to verify that \( G({\mathbf{H}}(k),{\mathbf{H}}(k)) = F({\mathbf{H}}(k)) \). According to Definition 1, it therefore suffices to show \( G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k)) \ge F({\mathbf{H}}(k)) \) to prove that \( G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k)) \) is an auxiliary function of \( F({\mathbf{H}}(k)) \).

Owing to the convexity of \( -\log (\cdot) \), the following inequality (Jensen's inequality) holds for any non-negative elements \( a_{m} \) satisfying \( \sum\nolimits_{m} {a_{m} } = 1 \):

$$ - \log \left( {\sum\limits_{m} {Z_{i,m} (k)H_{m,j} (k)} } \right) \le - \sum\limits_{m} {a_{m} \log \frac{{Z_{i,m} (k)H_{m,j} (k)}}{{a_{m} }}} $$
(29)

Setting \( a_{m} = \frac{{Z_{i,m} (k)H_{m,j}^{(t)} (k)}}{{\sum\nolimits_{m} {Z_{i,m} (k)H_{m,j}^{(t)} (k)} }} \), Eq. (29) can be rewritten as follows:

$$ - \log \left( {\sum\limits_{m} {Z_{i,m} (k)H_{m,j} (k)} } \right) \le - \sum\limits_{m} {\frac{{Z_{i,m} (k)H_{m,j}^{(t)} (k)}}{{\sum\nolimits_{m} {Z_{i,m} (k)H_{m,j}^{(t)} (k)} }}\left( {\log \left( {Z_{i,m} (k)H_{m,j} (k)} \right) - \log \frac{{Z_{i,m} (k)H_{m,j}^{(t)} (k)}}{{\sum\nolimits_{m} {Z_{i,m} (k)H_{m,j}^{(t)} (k)} }}} \right)} $$
(30)

From Eq. (30), it is easily observed that \( G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k)) \ge F({\mathbf{H}}(k)) \). Consequently, \( G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k)) \) can be viewed as an auxiliary function of \( F({\mathbf{H}}(k)) \).
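Equation (29) is Jensen's inequality applied to the convex function \( -\log (\cdot) \). A quick numeric spot-check in Python for one fixed index pair \( (i,j) \), with random non-negative data (a sketch, not part of the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.random(5) + 0.1    # Z_{i,m}(k) for a fixed (i, k): positive entries
h = rng.random(5) + 0.1    # H_{m,j}(k) for a fixed (j, k)
h_t = rng.random(5) + 0.1  # previous iterate H^{(t)}_{m,j}(k)

a = z * h_t / np.sum(z * h_t)         # weights a_m: non-negative, sum to 1
lhs = -np.log(np.sum(z * h))          # left side of Eq. (29)
rhs = -np.sum(a * np.log(z * h / a))  # right side of Eq. (29)
assert lhs <= rhs + 1e-12             # Jensen: -log is convex
```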

Proof of Theorem 1

The minimum of \( G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k)) \) with respect to \( H_{m,l} (k) \) is obtained by setting the gradient to zero:

$$ \begin{aligned} \frac{{\partial G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k))}}{{\partial H_{m,l} (k)}} & = - \sum\limits_{i} {B_{i,l} (k)\frac{{Z_{i,m} (k)H_{m,l}^{(t)} (k)}}{{\sum\nolimits_{n} {Z_{i,n} (k)H_{n,l}^{(t)} (k)} }}} \frac{1}{{H_{m,l} (k)}} + \sum\limits_{i} {Z_{i,m} (k)} \\ & \quad + \frac{{4rat_{r} \gamma }}{{N_{r} - 1}}\left( {H_{m,l} (k) - \mu_{m}^{(r)} (k)} \right) - \frac{{4\delta }}{{N_{r} C\left( {C - 1} \right)}}\sum\limits_{i \ne r}^{C} {W^{\prime}_{i,r} (k)\left( {\mu_{m}^{(r)} (k) - \mu_{m}^{(i)} (k)} \right)} + \lambda \\ & = 0 \\ \end{aligned} $$
(31)

The above equation is quadratic in \( H_{m,l} (k) \), and solving it yields the iterative rule in Eq. (21). Since \( G({\mathbf{H}}(k),{\mathbf{H}}^{(t)} (k)) \) is an auxiliary function, Lemma 1 implies that \( F({\mathbf{H}}(k)) \), i.e., \( D_{NEDMF\_SR} \), is nonincreasing under the iterative rule in Eq. (21).
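For intuition, when the weights \( \gamma \), \( \delta \), and \( \lambda \) are all zero, the stationarity condition in Eq. (31) should collapse to the classical multiplicative update of divergence-based NMF (Lee and Seung). A minimal Python sketch of that reduced update is given below; it is an illustration under that simplifying assumption, not the paper's full rule in Eq. (21):

```python
import numpy as np

def kl_nmf_update_H(B, Z, H, eps=1e-12):
    """One multiplicative H-update for plain divergence-based NMF.
    The paper's full Eq. (21) additionally carries the discriminant and
    sparsity terms from Eq. (31) and solves a quadratic instead."""
    ZH = Z @ H
    numer = Z.T @ (B / (ZH + eps))        # Z^T (B / (ZH)), elementwise ratio
    denom = Z.sum(axis=0)[:, None] + eps  # column sums of Z
    return H * numer / denom

# Usage sketch: alternate with the matching Z-update until the divergence stalls.
rng = np.random.default_rng(1)
B = rng.random((30, 20))                        # non-negative data
Z = rng.random((30, 5)); H = rng.random((5, 20))
for _ in range(50):
    H = kl_nmf_update_H(B, Z, H)
```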


About this article


Cite this article

Tong, M., Bu, H., Zhao, M. et al. Non-negative enhanced discriminant matrix factorization method with sparsity regularization. Neural Comput & Applic 31, 3117–3140 (2019). https://doi.org/10.1007/s00521-017-3258-3

