Abstract
Nonnegative learning aims to learn part-based representations of nonnegative data and has received much attention in recent years. Nonnegative matrix factorization (NMF) has been a popular way to make nonnegative learning applicable; it can also be interpreted as an optimization problem with bound constraints. To exploit the informative components hidden in nonnegative patterns, a novel nonnegative learning method, termed nonnegative class-specific entropy component analysis (NCECA), is developed in this work. In contrast to existing methods, the proposed method works with general objective functions, and the conjugate gradient technique is applied to accelerate the iterative optimization. Building on this development, a general nonnegative learning framework is presented to handle nonnegative optimization problems with general objective costs. Owing to the general objective costs and the nonnegative bound constraints, an ill-conditioned nonnegative learning problem often arises. To address this limitation, a modified line search criterion is proposed, which avoids the null trap under guaranteed conditions while keeping the feasible step descendent. In addition, a numerical stopping rule is employed in place of the popular gradient-based one to improve efficiency. Experiments on face recognition under a variety of conditions show that the proposed method outperforms the other methods considered.
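The ingredients named in the abstract (a bound-constrained objective, a backtracking line search with a sufficient-decrease test, and a numerical objective-based stopping rule) can be sketched on the plain NMF cost. This is a hypothetical illustration of the general scheme, not the paper's NCECA algorithm: the function names (`pg_nmf`, `_pg_update`) and all parameter values are assumptions, and the sufficient-decrease test follows the standard projected-gradient form of Lin (2007) rather than the modified criterion proposed here.

```python
import numpy as np

def _pg_update(V, W, H, beta=0.5, sigma=1e-4):
    """One projected-gradient step on H for fixed W, with Armijo-style
    backtracking. Minimizes 0.5*||V - W H||_F^2 subject to H >= 0."""
    grad = W.T @ (W @ H - V)
    f_old = 0.5 * np.linalg.norm(V - W @ H) ** 2
    step = 1.0
    while step > 1e-12:
        # project each trial point onto the nonnegative orthant
        H_new = np.maximum(H - step * grad, 0.0)
        f_new = 0.5 * np.linalg.norm(V - W @ H_new) ** 2
        # sufficient-decrease (Armijo) test for bound-constrained problems
        if f_new - f_old <= sigma * np.sum(grad * (H_new - H)):
            return H_new
        step *= beta  # backtrack
    return H

def pg_nmf(V, r, max_iter=100, tol=1e-6, seed=0):
    """Alternating projected-gradient NMF with a numerical stopping rule."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], r))
    H = rng.random((r, V.shape[1]))
    prev = 0.5 * np.linalg.norm(V - W @ H) ** 2
    for _ in range(max_iter):
        H = _pg_update(V, W, H)
        W = _pg_update(V.T, H.T, W.T).T  # update W by symmetry
        cur = 0.5 * np.linalg.norm(V - W @ H) ** 2
        # numerical stopping rule: stop when the objective stalls,
        # instead of testing the projected-gradient norm
        if prev - cur <= tol * max(1.0, prev):
            break
        prev = cur
    return W, H
```

The backtracking loop is what prevents an infeasible or non-descent step: every trial point is projected back onto the nonnegative orthant before the decrease test is checked.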
Notes
Note that, for some specific purposes where another existing upper bound is available, it is actually unnecessary to impose such a constraint. In NCECA, the NMF approximation is still involved in order to make a universal arrangement.
Acknowledgments
The authors would like to thank the handling associate editor and the anonymous reviewers for their constructive comments, and the US Army Research Laboratory for providing the FERET database. This work was supported by a research grant from the Research Committee of the University of Macau.
Appendix A: Calculation of \( \nabla J_{E} (W) \)
The gradient \( \nabla \mathcal {H} ({W^T X | C}) \) is closely related to the gradient of the information potential \( \nabla \mathcal {V} ({W^T X | c}) \) of each class. The gradient \( \nabla \mathcal {V} ({W^T X | c}) \) is calculated as
Thus, the partial derivative \( {{\partial \mathcal {H} ({W^T X | C})} / {\partial W}} \) is given by
With the above partial derivatives, it is straightforward to obtain the gradient \( \nabla J_E (W) \).
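The gradient expressions themselves did not survive in this version of the text. As a hedged reconstruction, assuming the standard information-theoretic-learning setting (quadratic Renyi entropy \( \mathcal{H} = -\log \mathcal{V} \) estimated with a Gaussian Parzen kernel \( G \) of width \( \sigma \), which are assumptions here, not facts taken from this paper), the class-conditional information potential and its gradient take the form

\[
\mathcal{V}({W^T X | c}) = \frac{1}{N_c^{2}} \sum_{i \in c}\sum_{j \in c} G_{\sigma\sqrt{2}}\left(y_i - y_j\right), \qquad y_k = W^T x_k,
\]

\[
\nabla \mathcal{V}({W^T X | c}) = -\frac{1}{2\sigma^{2} N_c^{2}} \sum_{i \in c}\sum_{j \in c} G_{\sigma\sqrt{2}}\left(y_i - y_j\right)\,(x_i - x_j)\,(y_i - y_j)^T,
\]

and, since \( \mathcal{H}({W^T X | c}) = -\log \mathcal{V}({W^T X | c}) \),

\[
\frac{\partial \mathcal{H}({W^T X | c})}{\partial W} = -\,\frac{\nabla \mathcal{V}({W^T X | c})}{\mathcal{V}({W^T X | c})}.
\]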
Cite this article
Cheng, M., Pun, CM. & Tang, Y.Y. Nonnegative class-specific entropy component analysis with adaptive step search criterion. Pattern Anal Applic 17, 113–127 (2014). https://doi.org/10.1007/s10044-011-0258-2