Cost-sensitive dictionary learning for face recognition
Introduction
Face recognition is a challenging computer vision task that has been studied for over 30 years [1]. Many successful face recognition systems have been developed, such as Eigenfaces based on principal component analysis (PCA) [2], Fisherfaces based on linear discriminant analysis (LDA) [3], and Laplacianfaces based on locality preserving projections (LPP) [4]. These methods usually involve two stages: feature extraction and classification. Recently, the sparse representation technique has been applied in a variety of computer vision and pattern recognition problems [5], [6], [7], [8], [9]. Wright et al. [10] developed sparse representation based classification (SRC) and obtained promising results on face recognition. In SRC, a test image is encoded as a sparse linear combination of all the training samples and classified into the class with the minimum sparse reconstruction residual. Unlike conventional methods such as Eigenfaces and Fisherfaces, SRC does not need an explicit feature extraction stage.
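The SRC decision rule described above can be sketched in a few lines. This is an illustrative reimplementation, not the authors' code: the ISTA solver stands in for the ℓ1 minimizer, and all function names and parameter values are our own choices.

```python
import numpy as np

def ista_lasso(D, y, lam=0.01, n_iter=200):
    """Solve min_x 0.5*||y - D x||^2 + lam*||x||_1 with ISTA."""
    L = np.linalg.norm(D, 2) ** 2   # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - D.T @ (D @ x - y) / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return x

def src_classify(D, labels, y, lam=0.01):
    """SRC: code y over all training samples (columns of D), then assign
    the class whose coefficients give the smallest reconstruction residual."""
    x = ista_lasso(D, y, lam)
    best_class, best_res = None, np.inf
    for c in np.unique(labels):
        xc = np.where(labels == c, x, 0.0)   # keep class-c coefficients only
        res = np.linalg.norm(y - D @ xc)
        if res < best_res:
            best_class, best_res = c, res
    return best_class
```

Note that the dictionary `D` here is simply the matrix of raw training samples, which is exactly the design choice the dictionary learning literature below improves upon.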
As is well known, Wright et al. [10] directly employ the entire set of training samples as the dictionary for discriminative sparse coding. However, it has been demonstrated that learning a dictionary from the original training samples, instead of using a predefined one such as wavelets [11], can lead to much better results [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22]. In [14], the K-SVD dictionary learning algorithm is introduced, which generalizes k-means clustering and efficiently learns an overcomplete dictionary from a set of training samples. Lee et al. [15] proposed an efficient reconstructive dictionary learning method that shows promising results in self-taught learning [16] and image categorization [17]. Yang et al. [18] proposed a metaface learning (MFL) algorithm to represent the training samples of each class by a series of learned “metafaces”. Based on K-SVD, Zhang et al. [19] developed the D-KSVD algorithm, which simultaneously learns a linear classifier. Jiang et al. [20], [21] introduced a label-consistency regularization to enforce the discrimination of the coding vectors; the resulting LC-KSVD algorithm exhibits good classification results. Yang et al. [22] proposed a Fisher discrimination dictionary learning method, in which a class-specific strategy is adopted to learn a structured dictionary and the Fisher discrimination criterion is imposed on the coding vectors to enhance class discrimination. Moreover, many other efforts have been devoted to learning a proper dictionary for particular applications, e.g., image denoising [23], [24], image inpainting [25], and image classification [26], [27], [28], [29], [30].
However, current sparse representation and dictionary learning based methods target only low recognition error and implicitly assume that all misclassifications incur the same loss. Although this assumption is widely adopted, we argue that it is often unreasonable: in most real-world applications, different kinds of mistakes lead to different amounts of loss. For example, consider a door locker based on a face recognition system for a certain group (e.g., family members or company employees). Misrecognizing a gallery person as an impostor and denying entry merely causes inconvenience, but misrecognizing an impostor as a gallery person and granting entry may result in serious loss or damage. This example shows that face recognition is a cost-sensitive pattern classification problem, a fact neglected by most existing face recognition algorithms.
Cost-sensitive learning is an important topic in the data mining and machine learning communities [31], [32], [33], [34], [35]. In this setting, cost information is introduced to measure the importance of samples in different classes, with different costs reflecting different amounts of loss. The goal of cost-sensitive learning is to minimize the total cost rather than the total error. Generally, there are two kinds of misclassification cost. The first is class-dependent, where the costs of misclassifying any example of class A as class B are the same. The second is example-dependent, where these costs may differ from example to example. In this paper, we focus on the former, because face recognition is generally a class-dependent cost-sensitive problem.
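The difference between minimizing total error and minimizing total cost can be made concrete with a toy two-class (gallery vs. impostor) example; the cost values below are hypothetical, chosen only to reflect the asymmetry of the door-locker scenario.

```python
import numpy as np

# Hypothetical class-dependent cost matrix: C[i, j] is the loss of
# deciding class j when the true class is i (diagonal = correct, zero cost).
C = np.array([[0.0, 1.0],    # gallery person rejected as impostor: inconvenience
              [10.0, 0.0]])  # impostor accepted as gallery person: serious loss

def min_expected_cost(posteriors, C):
    """Pick the class minimizing the expected misclassification cost."""
    risks = posteriors @ C          # risks[j] = sum_i p(i) * C[i, j]
    return int(np.argmin(risks))
```

With posteriors `[0.6, 0.4]`, an error-minimizing rule would accept the subject as gallery (class 0), while the expected-cost rule weighs the asymmetric losses and rejects instead; only when the gallery posterior is much higher do the two rules agree.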
Several cost-sensitive learning algorithms have been proposed in the literature, such as cost-sensitive boosting [32], [34], cost-sensitive SVMs [31], cost-sensitive semi-supervised learning [31], and cost-sensitive neural networks [33]. Zhang et al. [35] presented a multiclass cost-sensitive learning framework for face recognition that aims to minimize the total misclassification loss instead of the classification error. Lu et al. [36], [37] introduced cost information into four popular and widely used linear subspace learning algorithms and devised the corresponding cost-sensitive methods, namely CSPCA, CSLDA, CSLPP, and CSMFA. Lu et al. [38] proposed a cost-sensitive semi-supervised discriminant analysis method that uses both labeled and unlabeled samples while exploring the cost information of all training samples simultaneously. To the best of our knowledge, [39] is the first work that formally introduced the cost-sensitive idea into sparse representation, presenting a sparse cost-sensitive classifier (SCS-C). SCS-C uses the probabilistic model of sparse representation to estimate the posterior probabilities of a test sample and computes all the misclassification losses from these posteriors; the test sample is then assigned to the class with minimal loss. Note that SCS-C uses all the training samples to form the dictionary for sparse representation. However, the original training samples may contain redundant information, noise, or other trivial information that obstructs correct recognition. Intuitively, a more accurate and discriminative representation can be obtained if we optimize a dictionary from the original training samples. Motivated by these concerns, in this paper we propose a novel cost-sensitive dictionary learning approach for sparse representation based classification. Our method considers the cost information during the sparse coding stage.
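The SCS-C decision step can be sketched as follows. For illustration only, we assume posteriors proportional to a softmax over negative squared reconstruction residuals; this is a stand-in, not the exact probabilistic model of [39].

```python
import numpy as np

def scs_decision(residuals, C, sigma=1.0):
    """Sketch of a cost-sensitive sparse-classifier decision step.

    residuals: per-class sparse reconstruction residuals of the test sample.
    C:         class-dependent cost matrix, C[i, j] = loss of deciding j
               when the truth is i.
    The residual-to-posterior mapping below is an illustrative assumption.
    """
    s = np.exp(-np.asarray(residuals) ** 2 / sigma)
    p = s / s.sum()                 # approximate class posteriors
    risks = p @ C                   # expected loss of each decision
    return int(np.argmin(risks))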
We introduce a new “cost” penalizing matrix and enforce the cost-sensitive constraint throughout the learning process. The learned dictionary produces cost-sensitive sparse codes and encourages samples from the same class to have similar sparse codes and samples from different classes to have dissimilar ones. To the best of our knowledge, our work is the first attempt to introduce cost information into dictionary learning.
The rest of this paper is organized as follows. Section 2 reviews related work on sparse representation and dictionary learning. In Section 3, we formulate cost-sensitive learning for face recognition, and in Section 4 we present our proposed cost-sensitive dictionary learning algorithm. Experimental results on real image datasets are given in Section 5. Section 6 concludes the paper.
Related work
In this section, we briefly review related work on sparse representation based classification and dictionary learning.
Cost-sensitive learning for face recognition
Let the training set be written as X = [x_1, x_2, ..., x_n] ∈ R^{d×n}, where d is the feature dimension of each sample. Let the first n_G samples be the gallery subjects with class labels {1, ..., G}, and let the remaining samples be impostor subjects; in this paper we regard all impostors as a single metaclass with label G+1. Because face recognition is generally a class-dependent cost-sensitive problem, there are usually three types of misclassification cost: a gallery subject rejected as an impostor, an impostor accepted as a gallery subject, and a gallery subject confused with another gallery subject.
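Under this setup, the class-dependent cost matrix for G gallery classes plus one impostor metaclass could be built as follows; the three cost values are illustrative defaults, not taken from the paper.

```python
import numpy as np

def build_cost_matrix(G, c_gi=1.0, c_ig=10.0, c_gg=1.0):
    """Class-dependent cost matrix for G gallery classes plus one impostor
    metaclass (label G+1 in the paper's notation; index G here, 0-based).
    Entry [i, j] is the cost of deciding class j when the truth is i.

    c_gg: gallery subject confused with another gallery subject
    c_gi: gallery subject rejected as an impostor (false rejection)
    c_ig: impostor accepted as a gallery subject (false acceptance)
    """
    C = np.full((G + 1, G + 1), c_gg)
    np.fill_diagonal(C, 0.0)        # correct decisions cost nothing
    C[:G, G] = c_gi                 # gallery -> impostor
    C[G, :G] = c_ig                 # impostor -> gallery
    return C
```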
Formulation with cost sensitive
We aim to leverage the cost information of the training samples to learn a cost-sensitive dictionary. Each atom in the dictionary is chosen to represent a subset of training samples, ideally from a single class, so each atom can be associated with a particular label; there is thus an explicit correspondence between dictionary atoms and class labels in our approach. The objective function of CSDL can be formulated as follows:
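This snippet omits the objective itself, so as background we sketch only the generic alternating skeleton such formulations are typically optimized with: sparse coding alternated with a MOD-style least-squares dictionary update. The cost-sensitive penalty on the coding vectors is deliberately left out, since its exact form is not shown here.

```python
import numpy as np

def sparse_code(D, X, lam, n_iter=100):
    """Lasso coding of all columns of X over dictionary D, via ISTA."""
    L = np.linalg.norm(D, 2) ** 2
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        Z = A - D.T @ (D @ A - X) / L
        A = np.sign(Z) * np.maximum(np.abs(Z) - lam / L, 0.0)
    return A

def dict_learn(X, n_atoms, lam=0.1, n_outer=10, seed=0):
    """Generic alternating dictionary-learning skeleton (no cost term)."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_outer):
        A = sparse_code(D, X, lam)
        # MOD update: least-squares fit of D to X given the codes A
        D = X @ A.T @ np.linalg.inv(A @ A.T + 1e-6 * np.eye(n_atoms))
        D /= np.linalg.norm(D, axis=0) + 1e-12   # unit-norm atoms
    A = sparse_code(D, X, lam)   # final codes for the final dictionary
    return D, A
```

In a class-specific scheme like the one described above, one would run such an update per class (or append a cost-weighted discriminative term to the coding step), so that each learned atom stays tied to a single class label.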
Experiments
In this section, we perform extensive experiments to verify the performance of our method, CS-SRC, on five publicly available face databases, namely the Extended Yale B [46], AR [47], FERET [48], CMU PIE [49], and Yale databases. To clearly illustrate the advantage of the proposed method, we first demonstrate (using ROC curves) the effectiveness of sparsity as a means of validating test images, compared with SRC [10] and Bayesian Fusion-based image Selection and Recognition (BF-SR) [50].
Conclusion
In this paper, we proposed a new dictionary learning approach, called cost-sensitive dictionary learning for SRC (CS-SRC). Our main contribution is to integrate the “cost” penalizing matrix into the objective function for dictionary learning. Additionally, the solution to the new objective function is obtained efficiently via alternating optimization. Unlike most existing dictionary learning approaches, which do not take cost information into account, our
Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grants 61273251 and 61401209, in part by the Natural Science Foundation of Jiangsu Province, China under Grant BK20140790, in part by China Postdoctoral Science Foundation under Grants 2014T70525 and 2013M531364, in part by Graduate Innovation Project of Jiangsu Province under Grant KYLX_0380, in part by Project of Civil Space Technology Preresearch of the 12th Five-Year Plan, China and in part by National Natural
Guoqing Zhang received his B.S. and Master's degrees from the School of Information Engineering at Yangzhou University in 2009 and 2012, respectively. He is currently a Ph.D. candidate in the School of Computer Science and Engineering, Nanjing University of Science and Technology, China. His research interests include pattern recognition, machine learning, image processing, and computer vision.
References (53)
- et al., Compression of facial images using the K-SVD algorithm, J. Vis. Image Represent. (2008)
- et al., Cost-sensitive boosting for classification of imbalanced data, Pattern Recognit. (2007)
- et al., Face recognition: a literature survey, ACM Comput. Surv. (CSUR) (2003)
- et al., Eigenfaces for recognition, J. Cognit. Neurosci. (1991)
- et al., Eigenfaces vs. Fisherfaces: recognition using class specific linear projection, IEEE Trans. Pattern Anal. Mach. Intell. (1997)
- X. He, P. Niyogi, Locality preserving projections, in: Proceedings of the NIPS, ...
- J. Wright, Y. Ma, J. Mairal, G. Sapiro, T. Huang, S. Yan, Sparse representation for computer vision and pattern ...
- et al., Image denoising via sparse and redundant representations over learned dictionaries, IEEE Trans. Image Process. (2006)
- O. Bryt, M. Elad, Improving the K-SVD facial image compression using a linear deblocking method, in: Proceedings of the ...
- et al., Sparse representation for color image restoration, IEEE Trans. Image Process. (2008)
- Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell.
- Task-driven dictionary learning, IEEE Trans. Pattern Anal. Mach. Intell.
- K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation, IEEE Trans. Signal Process.
- Label consistent K-SVD: learning a discriminative dictionary for recognition, IEEE Trans. Pattern Anal. Mach. Intell.
Huaijiang Sun was born in 1968. He received his B.Eng. and Ph.D. degrees from the School of Marine Engineering, Northwestern Polytechnical University, Xi’an, China, in 1990 and 1995, respectively. He is currently a Professor in the Department of Computer Science and Engineering at Nanjing University of Science and Technology. His research interests include computer vision and pattern recognition, image and video processing, and intelligent information processing.
Zexuan Ji received his B.E. degree in Computer Science and Ph.D. degree in Pattern Recognition and Intelligence System from Nanjing University of Science and Technology, Nanjing, China, in 2007 and 2012, respectively. He is currently a postdoctoral research fellow in the School of Computer Science and Engineering at the Nanjing University of Science and Technology. His research interests include medical imaging, image processing, and pattern recognition.
Yun-Hao Yuan received the M. Eng. degree in computer science and technology from Yangzhou University, China, in 2009, and the Ph.D. degree in pattern recognition and intelligence system from Nanjing University of Science and Technology (NUST), China, in 2013. He is an Associate Professor in the School of Internet of Things Engineering, Jiangnan University (JNU), China. He has published more than 20 scientific papers. He is currently a member of International Society of Information Fusion (ISIF). His research interests include pattern recognition, image processing, computer vision, and information fusion.
Quansen Sun received his Ph.D. in Pattern Recognition and Intelligence System from Nanjing University of Science and Technology, Nanjing, China, in 2006. He is currently a professor in the School of Computer Science and Engineering at the Nanjing University of Science and Technology. His research interests include pattern recognition, image processing, computer vision and data fusion.