Pattern Recognition

Volume 60, December 2016, Pages 613-629

Cost-sensitive dictionary learning for face recognition

https://doi.org/10.1016/j.patcog.2016.06.012

Highlights

  • A cost-sensitive dictionary learning algorithm for SRC is proposed.

  • A new “cost” penalizing matrix is introduced during the sparse coding stage.

  • The cost-sensitive requirement is enforced throughout the learning process.

  • The learned dictionary is able to produce cost-sensitive sparse coding.

  • Our method can achieve a minimum overall recognition loss.

Abstract

As one of the most popular research topics, sparse representation and dictionary learning techniques have received an increasing amount of interest in recent years. Sparse representation based classification (SRC) has been shown to be an effective method that produces impressive performance on face recognition. SRC directly uses the entire set of training samples as the dictionary for sparse coding. Recent research has shown that learning a dictionary from the training samples instead of using a predefined one can produce state-of-the-art results. However, all of these dictionary learning methods are designed to achieve low classification error and implicitly assume that the losses of all misclassifications are the same. In many real-world face recognition applications, this assumption may not hold, as different misclassifications can lead to different losses. Motivated by this concern, in this paper we propose a cost-sensitive dictionary learning algorithm for SRC, by which the designed dictionary is able to produce cost-sensitive sparse coding, resulting in improved classification performance in such scenarios. Our method considers the cost information during the sparse coding stages. Specifically, we introduce a new “cost” penalizing matrix and enforce the cost-sensitive requirement throughout the learning process. The optimal solution is obtained efficiently via alternating optimization. Experimental results demonstrate the effectiveness of the proposed method.

Introduction

Face recognition is a challenging computer vision task that has been studied for over 30 years [1]. Many successful face recognition systems have been developed, such as Eigenfaces based on principal component analysis (PCA) [2], Fisherfaces based on linear discriminant analysis (LDA) [3] and Laplacianfaces using locality preserving projection (LPP) [4]. These methods usually involve two stages: feature extraction and classification. Recently, sparse representation techniques have been applied in a variety of applications in computer vision and pattern recognition [5], [6], [7], [8], [9]. Wright et al. [10] developed sparse representation based classification (SRC) and obtained promising results on face recognition. In SRC, a test image is encoded as a sparse linear combination of all the training samples and classified into the class with the minimum sparse reconstruction residual. Unlike conventional methods such as Eigenfaces and Fisherfaces, SRC does not need an explicit feature extraction stage.
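To make the SRC decision rule concrete, the following minimal Python sketch codes a test sample over the matrix of all training samples and assigns it to the class with the smallest reconstruction residual. It uses scikit-learn's Lasso as an l1 proxy for the constrained l1-minimization in [10]; the function name and regularization value are illustrative, not from the paper.

```python
# Minimal sketch of the SRC decision rule of Wright et al. [10]: sparse
# coding over the training set, then minimum-residual classification.
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(X_train, y_train, x_test, alpha=0.01):
    """X_train: (m, n) matrix whose columns are training samples,
    y_train: (n,) class labels, x_test: (m,) test sample."""
    # Sparse coding over the dictionary formed by all training samples.
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
    coder.fit(X_train, x_test)
    a = coder.coef_                       # sparse coefficient vector
    residuals = {}
    for c in np.unique(y_train):
        # Keep only the coefficients belonging to class c.
        a_c = np.where(y_train == c, a, 0.0)
        residuals[c] = np.linalg.norm(x_test - X_train @ a_c)
    return min(residuals, key=residuals.get)   # minimum-residual class
```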

As is well known, Wright et al. [10] directly employ the entire set of training samples as the dictionary for discriminative sparse coding. However, it has been demonstrated that learning a dictionary from the original training samples instead of using a predefined one such as wavelets [11] can lead to much better results [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22]. In [14], a dictionary learning algorithm, K-SVD, is introduced that generalizes k-means clustering and efficiently learns an overcomplete dictionary from a set of training samples. Lee et al. [15] proposed an efficient reconstructive dictionary learning method which shows promising results in self-taught learning tasks [16] and image categorization [17]. Yang et al. [18] proposed a metaface learning (MFL) algorithm to represent training samples by a series of “metafaces” learned from each class. Based on K-SVD, Zhang et al. [19] developed the D-KSVD algorithm, which simultaneously learns a linear classifier. Jiang et al. [20], [21] introduced a label consistent regularization to enforce the discrimination of coding vectors; the resulting LC-KSVD algorithm exhibits good classification results. Yang et al. [22] proposed a Fisher discrimination dictionary learning method, where a category-specific strategy is adopted to learn a structured dictionary and the Fisher discrimination criterion is imposed on the coding vectors to enhance class discrimination. Moreover, many other efforts have been devoted to learning a proper dictionary for particular applications, e.g., image denoising [23], [24], image inpainting [25], and image classification [26], [27], [28], [29], [30].

However, current sparse representation and dictionary learning based methods target only low recognition error and implicitly assume that the losses of all misclassifications are the same. Although this assumption is widely adopted, we argue that it is not really reasonable because, in most real-world applications, different kinds of mistakes generally lead to different amounts of loss. For example, consider a door locker based on a face recognition system for a certain group (e.g., family members or company employees); it may cause inconvenience if a gallery person is misrecognized as an impostor and denied entry, but may result in serious loss or damage if an impostor is misrecognized as a gallery person and allowed to enter the room. From this example, we can conclude that face recognition is a cost-sensitive pattern classification problem, one that has been neglected by most existing face recognition algorithms.

Cost-sensitive learning is an important topic in the data mining and machine learning communities [31], [32], [33], [34], [35]. In such settings, cost information is introduced to measure the importance of samples in different classes, and different costs reflect different amounts of loss. The purpose of cost-sensitive learning is to minimize the total cost rather than the total error. Generally, there are two kinds of misclassification costs. The first is class-dependent, where the costs of misclassifying any example in class A to class B are the same. The second is example-dependent, where the costs of classifying different examples in class A to class B differ. In this paper, we focus on the former, because face recognition is generally a class-dependent cost-sensitive problem, as illustrated by the toy example below.
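As a toy illustration of the class-dependent setting (all numbers below are assumed, not from the paper), the snippet contrasts the total error that conventional methods minimize with the total cost that cost-sensitive learning minimizes:

```python
# cost[a][b] is the loss of predicting class b for a sample of true class a,
# so in the class-dependent setting it depends only on the class pair,
# never on the individual sample.
import numpy as np

classes = ["gallery", "impostor"]
# Assumed costs: accepting an impostor (row 1, col 0) is ten times
# worse than rejecting a gallery subject (row 0, col 1).
cost = np.array([[0.0, 1.0],
                 [10.0, 0.0]])

y_true = np.array([0, 0, 1, 1, 1])
y_pred = np.array([0, 1, 0, 1, 1])

total_error = np.mean(y_true != y_pred)   # what error-driven methods minimize
total_cost = cost[y_true, y_pred].sum()   # what cost-sensitive learning minimizes
print(total_error, total_cost)            # 0.4 vs 11.0: one false acceptance dominates
```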

Several cost-sensitive learning algorithms have been proposed in the literature, such as cost-sensitive boosting [32], [34], cost-sensitive SVM [31], cost-sensitive semi-supervised learning [31], and cost-sensitive neural networks [33]. Zhang et al. [35] presented a multiclass cost-sensitive learning framework for face recognition which aims at minimizing the total loss of misclassifications instead of the classification error. Lu et al. [36], [37] introduced cost information into four popular and widely used linear subspace learning algorithms and devised the corresponding cost-sensitive methods, namely CSPCA, CSLDA, CSLPP and CSMFA. Lu et al. [38] proposed a cost-sensitive semi-supervised discriminant analysis method that makes use of both labeled and unlabeled samples while exploring the different cost information of all the training samples simultaneously. To the best of our knowledge, [39] is the first work that formally introduced the cost-sensitive idea into sparse representation; it presented a sparse cost-sensitive classifier (SCS-C) that utilizes the probabilistic model of sparse representation to estimate the posterior probabilities of a test sample and calculates all the misclassification losses via these posterior probabilities. Finally, the test sample is assigned to the class with minimal loss (see the sketch below). Note that SCS-C uses all the training samples to form the dictionary for sparse representation. However, the original training samples may contain redundant information, noise, or other trivial information that obstructs correct recognition. Intuitively, a more accurate and discriminative representation can be obtained if we optimize a dictionary from the original training samples. Motivated by the above concerns, in this paper we propose a novel cost-sensitive dictionary learning approach for sparse representation based classification. Our method considers the cost information during the sparse coding stage. We introduce a new “cost” penalizing matrix and enforce the cost-sensitive constraint throughout the learning process. The learned dictionary is able to produce cost-sensitive sparse coding and encourages samples from the same class to have similar sparse codes and those from different classes to have dissimilar sparse codes. To the best of our knowledge, our work is the first attempt to introduce cost information into dictionary learning.
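A hedged sketch of the decision stage described for SCS-C: given posterior estimates and a class-dependent cost matrix, the test sample goes to the class with minimal expected loss. How the posteriors are estimated from sparse codes is specific to [39] and is not reproduced here.

```python
# Minimum-expected-loss decision: expected[b] = sum_a p(a|x) * cost[a, b].
import numpy as np

def min_expected_loss(posteriors, cost):
    """posteriors: (C,) vector of p(c|x); cost: (C, C) matrix where
    cost[a, b] is the loss of predicting b when the true class is a."""
    expected = posteriors @ cost
    return int(np.argmin(expected))

# Toy usage with the assumed costs from the earlier example:
cost = np.array([[0.0, 1.0], [10.0, 0.0]])
print(min_expected_loss(np.array([0.6, 0.4]), cost))
# -> 1: although "gallery" is the more probable class, accepting carries
#    too much expected loss, so the sample is rejected as an impostor.
```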

The rest of this paper is organized as follows: Section 2 reviews related work on sparse representation and dictionary learning. In Section 3, we formulate cost-sensitive learning for face recognition, and we then present our proposed algorithm for cost-sensitive dictionary learning in Section 4. Experimental results on real image datasets are given in Section 5. Section 6 concludes the paper.

Section snippets

Related work

In this section, we briefly review related work on sparse representation based classification and dictionary learning.

Cost-sensitive learning for face recognition

We can write $X$ as $X=[x_1,x_2,\ldots,x_s,x_{s+1},\ldots,x_n]$, where $x_i\in\mathbb{R}^m$ and $m$ is the feature dimension of each sample, $1\le i\le n$. Let the first $s$ samples $X_M=[x_1,x_2,\ldots,x_s]$ be the gallery subjects with class labels $\{G_i\}_{i=1,\ldots,M}$ and let the remaining samples $X_I=[x_{s+1},\ldots,x_n]$ be impostor subjects with class labels $\{I_i\}_{i=1,\ldots,L}$. In this paper we regard these impostors as a metaclass with label $I$. Because face recognition is generally a class-dependent cost-sensitive problem, there are usually three types of
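The snippet above truncates before enumerating the cost types, so the following sketch assumes the convention common in cost-sensitive face recognition (e.g., [35]): distinct costs for misclassifying one gallery subject as another, for falsely rejecting a gallery subject as an impostor, and for falsely accepting an impostor as a gallery subject. The symbols C_GG, C_GI, C_IG and their values are assumptions for illustration.

```python
# Assumed class-dependent cost matrix for M gallery classes plus one
# impostor metaclass: C_GG (gallery -> wrong gallery), C_GI (gallery ->
# impostor, false rejection), C_IG (impostor -> gallery, false acceptance).
import numpy as np

def build_cost_matrix(M, C_GG=1.0, C_GI=2.0, C_IG=20.0):
    """Classes 0..M-1 are gallery subjects G_1..G_M; class M is the
    impostor metaclass I. Correct decisions cost 0."""
    C = np.full((M + 1, M + 1), C_GG)   # gallery misrecognized as another gallery
    np.fill_diagonal(C, 0.0)            # correct classification is free
    C[:M, M] = C_GI                     # gallery rejected as impostor
    C[M, :M] = C_IG                     # impostor accepted as gallery
    return C
```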

Cost-sensitive formulation

We aim to leverage the cost information of the training samples to learn a cost-sensitive dictionary. Each atom in the dictionary is chosen so that it represents a subset of the training samples, ideally from a single class. Thus each dictionary atom can be associated with a particular label, and there is an explicit correspondence between dictionary atoms and labels in our approach. The objective function of CSDL can be formulated as follows:

$$\min_{D,\Lambda}\ \|X-D\Lambda\|_F^2+\lambda_1\|Q-\Lambda\|_F^2+\lambda_2\sum_{i=1}^{n}\|\alpha_i\|_1\quad \text{s.t.}\ \|d_k\|_2=1,\ k=1,\ldots,K$$

where $\alpha_i$ denotes the $i$-th column of the coding matrix $\Lambda$, $Q$ is the cost-penalizing matrix, and $d_k$ is the $k$-th dictionary atom.
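The paper solves this objective by alternating optimization (detailed in Section 4). The sketch below is a generic reconstruction under the formulation above, not the authors' code: with D fixed, the first two terms stack into a single least-squares block, so each column of Λ solves an ordinary l1-regularized regression; with Λ fixed, D has a closed-form least-squares update followed by renormalizing the atoms.

```python
# Generic alternating-optimization sketch for the CSDL objective above;
# solver choices (Lasso, pseudo-inverse update) are assumptions.
import numpy as np
from sklearn.linear_model import Lasso

def csdl(X, Q, K, lam1=1.0, lam2=0.1, n_iter=10, seed=0):
    """X: (m, n) training data, Q: (K, n) cost-penalizing target matrix,
    K: number of atoms. Returns dictionary D and coding matrix Lam."""
    m, n = X.shape
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((m, K))
    D /= np.linalg.norm(D, axis=0)                 # unit-norm atoms
    Lam = np.zeros((K, n))
    for _ in range(n_iter):
        # Sparse coding: min ||X - D L||_F^2 + lam1 ||Q - L||_F^2 + lam2 sum ||l_i||_1
        # stacks into ||[X; sqrt(lam1) Q] - [D; sqrt(lam1) I] L||_F^2 + l1 term.
        A = np.vstack([D, np.sqrt(lam1) * np.eye(K)])
        B = np.vstack([X, np.sqrt(lam1) * Q])
        # sklearn's Lasso scales the data term by 1/(2*n_samples).
        coder = Lasso(alpha=lam2 / (2 * A.shape[0]),
                      fit_intercept=False, max_iter=5000)
        for i in range(n):
            coder.fit(A, B[:, i])
            Lam[:, i] = coder.coef_
        # Dictionary update: least squares on ||X - D Lam||_F^2, then
        # renormalize atoms and rescale Lam so D @ Lam is unchanged.
        D = X @ Lam.T @ np.linalg.pinv(Lam @ Lam.T)
        norms = np.maximum(np.linalg.norm(D, axis=0), 1e-12)
        D /= norms
        Lam *= norms[:, None]
    return D, Lam
```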

Experiments

In this section, we perform extensive experiments to verify the performance of our method CS-SRC on five publicly available face databases, i.e., Extended Yale B [46], AR [47], FERET [48], CMU PIE [49] and Yale databases. In order to clearly illustrate the advantage of the proposed method, we first demonstrate (using ROC curves) the effectiveness of sparsity as a means of validating test images, compared with SRC [10] and Bayesian Fusion-based image Selection and Recognition (BF-SR) [50].
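In [10], sparsity-based validation is performed with the sparsity concentration index (SCI), and an ROC curve is traced by sweeping the rejection threshold; whether this paper uses SCI exactly is an assumption, so the sketch below is illustrative only.

```python
# Sparsity concentration index (SCI) from Wright et al. [10]: scores how
# concentrated a sparse code is on a single class; high SCI suggests a
# valid (gallery) image, low SCI an invalid one.
import numpy as np
from sklearn.metrics import roc_curve

def sci(a, y_train, n_classes):
    """a: sparse code over all training samples; y_train: (n,) labels."""
    per_class = [np.abs(a[y_train == c]).sum() for c in range(n_classes)]
    ratio = max(per_class) / max(np.abs(a).sum(), 1e-12)
    return (n_classes * ratio - 1) / (n_classes - 1)

# ROC over a batch of test codes, with is_gallery = 1 for genuine images:
# fpr, tpr, thr = roc_curve(is_gallery, [sci(a, y_train, M) for a in codes])
```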

Conclusion

In this paper, we proposed a new dictionary learning approach, called the cost-sensitive dictionary learning algorithm for SRC (CS-SRC). Our main contribution is to integrate the “cost” penalizing matrix into the objective function for dictionary learning. Additionally, the solution to the new objective function is obtained efficiently by the alternating optimization method. Unlike most existing dictionary learning approaches, which do not take cost information into account, our

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grants 61273251 and 61401209, in part by the Natural Science Foundation of Jiangsu Province, China under Grant BK20140790, in part by China Postdoctoral Science Foundation under Grants 2014T70525 and 2013M531364, in part by Graduate Innovation Project of Jiangsu Province under Grant KYLX_0380, in part by Project of Civil Space Technology Preresearch of the 12th Five-Year Plan, China and in part by National Natural


References (53)

  • O. Bryt et al., Compression of facial images using the K-SVD algorithm, J. Vis. Image Represent. (2008)
  • Y. Sun et al., Cost-sensitive boosting for classification of imbalanced data, Pattern Recognit. (2007)
  • W. Zhao et al., Face recognition: a literature survey, ACM Comput. Surv. (CSUR) (2003)
  • M. Turk et al., Eigenfaces for recognition, J. Cognit. Neurosci. (1991)
  • P.N. Belhumeur et al., Eigenfaces vs. Fisherfaces: recognition using class specific linear projection, IEEE Trans. Pattern Anal. Mach. Intell. (1997)
  • X. He, P. Niyogi, Locality preserving projections, in: Proceedings of the NIPS, ...
  • J. Wright, Y. Ma, J. Mairal, G. Sapiro, T. Huang, S. Yan, Sparse representation for computer vision and pattern ...
  • M. Elad et al., Image denoising via sparse and redundant representations over learned dictionaries, IEEE Trans. Image Process. (2006)
  • O. Bryt, M. Elad, Improving the K-SVD facial image compression using a linear deblocking method, in: Proceedings of the ...
  • J. Mairal et al., Sparse representation for color image restoration, IEEE Trans. Image Process. (2008)
  • J. Wright et al., Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell. (2009)
  • S. Mallat, A wavelet tour of signal processing, ...
  • J. Mairal et al., Task-driven dictionary learning, IEEE Trans. Pattern Anal. Mach. Intell. (2012)
  • J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, Y. Gong, Locality-constrained linear coding for image classification, in: ...
  • M. Aharon et al., K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation, IEEE Trans. Signal Process. (2006)
  • H. Lee, A. Battle, R. Raina, A.Y. Ng, Efficient sparse coding algorithms, in: Advances in Neural Information Processing ...
  • R. Raina, A. Battle, H. Lee, B. Packer, A.Y. Ng, Self-taught learning: transfer learning from unlabeled data, in: ...
  • J. Yang, K. Yu, Y. Gong, T. Huang, Linear spatial pyramid matching using sparse coding for image classification, in: ...
  • M. Yang, L. Zhang, J. Yang, D. Zhang, Metaface learning for sparse representation based face recognition, in: ...
  • Q. Zhang, B. Li, Discriminative K-SVD for dictionary learning in face recognition, in: Proceedings of IEEE Conference ...
  • Z. Jiang, Z. Lin, L. Davis, Learning a discriminative dictionary for sparse coding via label consistent K-SVD, in: ...
  • Z. Jiang et al., Label consistent K-SVD: learning a discriminative dictionary for recognition, IEEE Trans. Pattern Anal. Mach. Intell. (2013)
  • M. Yang, L. Zhang, X. Feng, D. Zhang, Fisher discrimination dictionary learning for sparse representation, in: ...
  • M. Elad et al., Image denoising via sparse and redundant representations over learned dictionaries, IEEE Trans. Image Process. (2006)
  • J. Mairal et al., Sparse representation for color image restoration, IEEE Trans. Image Process. (2008)
  • J. Mairal, F. Bach, J. Ponce, G. Sapiro, Online dictionary learning for sparse coding, in: Proceedings of the 26th ...

Guoqing Zhang received his B.S. and master's degrees from the School of Information Engineering at Yangzhou University in 2009 and 2012. He is currently a Ph.D. candidate in the School of Computer Science and Engineering, Nanjing University of Science and Technology, China. His research interests include pattern recognition, machine learning, image processing, and computer vision.

Huaijiang Sun was born in 1968. He received his B.Eng. and Ph.D. degrees from the School of Marine Engineering, Northwestern Polytechnical University, Xi'an, China, in 1990 and 1995, respectively. He is currently a Professor in the Department of Computer Science and Engineering at Nanjing University of Science and Technology. His research interests include computer vision and pattern recognition, image and video processing, and intelligent information processing.

Zexuan Ji received his B.E. degree in Computer Science and Ph.D. degree in Pattern Recognition and Intelligence System from Nanjing University of Science and Technology, Nanjing, China, in 2007 and 2012, respectively. He is currently a postdoctoral research fellow in the School of Computer Science and Engineering at Nanjing University of Science and Technology. His research interests include medical imaging, image processing and pattern recognition.

Yun-Hao Yuan received the M.Eng. degree in computer science and technology from Yangzhou University, China, in 2009, and the Ph.D. degree in pattern recognition and intelligence system from Nanjing University of Science and Technology (NUST), China, in 2013. He is an Associate Professor in the School of Internet of Things Engineering, Jiangnan University (JNU), China. He has published more than 20 scientific papers and is a member of the International Society of Information Fusion (ISIF). His research interests include pattern recognition, image processing, computer vision, and information fusion.

Quansen Sun received his Ph.D. in Pattern Recognition and Intelligence System from Nanjing University of Science and Technology, Nanjing, China, in 2006. He is currently a Professor in the School of Computer Science and Engineering at Nanjing University of Science and Technology. His research interests include pattern recognition, image processing, computer vision and data fusion.