Supervised within-class-similar discriminative dictionary learning for face recognition☆
Introduction
With the development of sparse-coding algorithms based on L0- or L1-norm optimization [1], [2], [3], [4], [5], [6], [7], [8], [9], sparse-representation-based techniques have been widely adopted for a variety of image-processing tasks, such as image restoration and compressive sensing [10], [11], [12]. More recently, sparse representation was extended to recognition tasks in [13], where it is referred to as sparse-representation-based classification (SRC). SRC achieves encouraging results on face recognition [13], [14], digit recognition [15], [16], and disease diagnosis [17]. SRC assumes that each test sample can be expressed as a sparse linear combination of the atoms (columns) of the dictionary; the test sample is then assigned the label of the class with the minimum representation residual. In some SRC applications, especially early ones, all of the training samples are used directly as the dictionary. This practice, however, makes the sparse representation computationally expensive for large training sets [18]. Thus, dictionary learning (DL) algorithms that construct small but representative dictionaries from the training samples have been proposed and investigated.
To derive a compact dictionary, a variety of DL algorithms have been developed [19], [20], [21]. The oldest is the method of optimal directions (MOD) [20]. Another popular DL method is K-SVD, which generalizes the K-means clustering process and updates the dictionary atoms jointly with the sparse codes to minimize the reconstruction error [19]. In [21], a meta-face-learning method was put forward that updates the dictionary atoms with a closed-form solution. These DL algorithms mainly seek an optimal dictionary to represent the data with minimum reconstruction error; they do not use the class labels of the training data, so they can be referred to as unsupervised DL. The dictionary or the representation coefficients learned by these algorithms may be effective for signal reconstruction but not discriminative enough for classification tasks.
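As a concrete illustration of the reconstruction-driven updates mentioned above, the MOD dictionary update has a simple closed form: with the sparse codes held fixed, the dictionary minimizing the Frobenius reconstruction error is obtained by least squares. The sketch below (Python/NumPy; the function name and ridge stabilizer are ours, not taken from any cited implementation) shows one such update step.

```python
import numpy as np

def mod_dictionary_update(Y, X, eps=1e-8):
    """One MOD step: closed-form least-squares dictionary update.

    Y : (d, n) data matrix; X : (K, n) current sparse codes.
    Returns the dictionary D (d, K) minimizing ||Y - D X||_F^2,
    with columns renormalized to unit l2 norm.
    """
    # D = Y X^T (X X^T)^{-1}; a small ridge keeps the inverse stable
    G = X @ X.T + eps * np.eye(X.shape[0])
    D = Y @ X.T @ np.linalg.inv(G)
    # dictionary atoms are conventionally kept at unit norm
    D /= np.maximum(np.linalg.norm(D, axis=0, keepdims=True), eps)
    return D
```

In a full MOD loop this step alternates with a sparse-coding step (e.g., OMP) that recomputes X for the new D.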
For deriving discriminative dictionaries suitable for classification tasks, a number of supervised DL algorithms utilizing the class labels of the training samples have been developed. There are, in general, two categories of supervised DL methods. The first constructs one dictionary per class [16], [22], [23], [24], [25], [26]. For example, Ramirez et al. developed a DL algorithm with structured incoherence (DLSI) for constructing class-specific dictionaries [16]. Yang et al. proposed a Fisher discrimination DL (FDDL) algorithm that derives per-class dictionaries under constraints of minimum within-class scatter and maximum between-class scatter [24]. Ma et al. put forward a discriminative low-rank DL method for sparse representation with a low-rank constraint on the class-specific dictionaries [27]. Li et al. developed a discriminative DL algorithm with low-rank regularization and a Fisher discriminant function [28]. All of these methods obtain one dictionary per class and then identify test samples by minimum reconstruction error. This category of supervised DL can derive representative dictionaries for each class, but the dictionary-updating and sparse-coding processes become complex when there are many classes.
The second category of supervised DL algorithms derives one dictionary shared by all of the classes. In this category, the algorithms mainly incorporate discriminative terms into the objective function to obtain a good recognition rate [18], [29], [30], [31], [32], [33]. For example, the linear classification error is used as the discriminative term in [18], [31], [32], [33]. Other researchers adopt a logistic loss term in the objective function [30], [33]. Furthermore, Huang and Aviyente applied the Fisher ratio between inter-class distance and within-class scatter as a discriminative constraint on the sparse coefficients [29]. These algorithms can be further split into two types: one directly incorporates a classification model (e.g., a linear, logistic, or hinge loss function) into the DL framework, so that the simultaneously derived classifier is used to classify test samples [18], [30], [31], [32], [33]; the other imposes discriminative constraints on the representation coefficients (e.g., Fisher criteria) and then employs the coefficients to construct a classifier for identifying test samples [29]. Would the combination of a classification model with restrictions on the representation coefficients further improve the discriminative power of the dictionary and enhance the recognition rate? Jiang et al. put forward an algorithm named LC-KSVD [18], which adds a classification error term and a discriminative sparse-code error term to the objective function. It achieves improved performance over algorithms without the discriminative sparse-code error term, such as D-KSVD [32] and K-SVD [19]. The discriminative sparse-code error term (label-consistency regularization) in LC-KSVD forces signals from the same class to have similar representations [18].
This indirect promotion of within-class similarity among the coding coefficients of each class has been shown to improve the recognition rate of the dictionary. One question can thus be posed: can directly restricting the coefficients to be similar within each class, combined with the linear classification error term, yield a more discriminative dictionary and further improve the recognition rate?
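For reference, the label-consistency targets behind LC-KSVD's discriminative sparse-code error term can be built mechanically from the class labels: an "ideal" code matrix Q that is block-constant within each class, and a one-hot label matrix H for the classification error term. A minimal sketch (our own naming; atoms are assumed to be allocated contiguously, a fixed number per class):

```python
import numpy as np

def build_label_consistency_targets(labels, num_atoms_per_class):
    """Build LC-KSVD-style targets for ||Q - A X||_F^2 and ||H - W X||_F^2.

    labels : length-n sequence of class indices 0..k-1.
    Q[j, i] = 1 if atom j and sample i belong to the same class;
    H is the one-hot label matrix used by the linear classifier term.
    """
    labels = np.asarray(labels)
    k = labels.max() + 1
    n = labels.size
    Q = np.zeros((k * num_atoms_per_class, n))
    H = np.zeros((k, n))
    for i, c in enumerate(labels):
        # mark the block of atoms reserved for sample i's class
        Q[c * num_atoms_per_class:(c + 1) * num_atoms_per_class, i] = 1.0
        H[c, i] = 1.0
    return Q, H
```

Forcing the codes X toward Q is what indirectly encourages same-class samples to share similar representations.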
In this paper, we propose to directly restrict the within-class scatter of a dictionary’s representation coefficients while simultaneously deriving the linear classification error term in a supervised DL scheme, referred to as the supervised within-class-similar discriminative DL (SCDDL) algorithm. We examine the SCDDL performance using three well-known face databases, the Extended YaleB database [34], the AR face database [35] and the CMU PIE face database [36]. The results show that SCDDL can achieve superior recognition rates in comparison to LC-KSVD [18] and some other state-of-the-art DL algorithms.
The novel contributions of this paper are threefold. First, we propose to directly impose a within-class-similarity restriction on the representation coefficients simultaneously with the derivation of the linear classification error in a supervised DL algorithm, to construct a more discriminative dictionary. Second, we demonstrate that directly constraining the within-class scatter term in the DL scheme enhances the Fisher criterion (the ratio of between-class scatter to within-class scatter) more than an indirect constraint (LC-KSVD) or no constraint (D-KSVD). Third, we show that the combination of directly restricted within-class scatter and a simultaneously derived classifier is a powerful tool for face recognition, compared with several other state-of-the-art DL algorithms.
The remainder of this paper is organized as follows. Section 2 introduces related DL algorithms. Section 3 describes the proposed SCDDL model. Section 4 presents the experimental results, and Section 5 concludes the paper.
Sparse representation-based classification
Sparse-representation-based classification (SRC) was put forward by Wright et al. for face recognition [13]. Suppose n d-dimensional training samples from k classes are denoted A = [A_1, A_2, …, A_k] ∈ ℝ^{d×n}, where A_i consists of the training samples from class i. When A is used directly as the dictionary (see below for a dictionary-learning framework for constructing more compact dictionaries), then for a test sample y ∈ ℝ^d, the SRC framework can be described as follows:
(i) Sparse coding of y: solve x̂ = argmin_x ‖x‖_1 subject to ‖y − Ax‖_2 ≤ ε (or the equivalent ℓ1-regularized least-squares problem).
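Putting the SRC framework together, the following sketch codes a test sample over the full training matrix with a simple ISTA solver for the ℓ1 problem (the unconstrained Lagrangian variant of the constrained formulation above) and then classifies by per-class reconstruction residual. All names are ours; this is an illustration, not the authors' implementation:

```python
import numpy as np

def ista_l1(A, y, lam=0.1, n_iter=200):
    """Minimize 0.5 * ||y - A x||_2^2 + lam * ||x||_1 by ISTA."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = A.T @ (A @ x - y)              # gradient of the smooth part
        z = x - g / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

def src_classify(A, labels, y, lam=0.1):
    """SRC: code y over all training samples A (columns), then pick the
    class whose atoms give the smallest reconstruction residual."""
    x = ista_l1(A, y, lam)
    best, best_r = None, np.inf
    for c in np.unique(labels):
        mask = labels == c
        r = np.linalg.norm(y - A[:, mask] @ x[mask])  # class-c residual
        if r < best_r:
            best, best_r = c, r
    return best
```

In practice the ℓ1 solvers cited in the references (e.g., FISTA, GPSR, l1_ls) replace the plain ISTA loop; the residual-based decision rule is unchanged.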
Supervised within-class-similar discriminative dictionary learning
A novel DL scheme named supervised within-class-similar discriminative dictionary learning (SCDDL) is proposed in this section to create a discriminative dictionary for classification.
Databases
Three face databases including the Extended YaleB database [34], the AR face database [35] and the CMU PIE face database [36] are used to evaluate SCDDL in this study.
Extended YaleB [34]: The Extended YaleB database consists of 2414 face images of 38 subjects under varying illumination conditions and expressions, with approximately 64 images per subject. Sample images of two subjects are shown in Fig. 1. The images were cropped to 192 × 168 pixels. Half of the images are
Conclusions
This paper proposes a new DL algorithm for face recognition, named supervised within-class-similar discriminative dictionary learning (SCDDL). The main contribution is the combination of a direct restriction on the within-class scatter of the representation coefficients with a simultaneous restriction on the linear classification error in the objective function, which can be viewed as a combination of FDDL with D-KSVD or LC-KSVD. Notably, the direct restriction of the within-class similar term, or the
Acknowledgements
This work is supported by the 863 Program (2015AA020912), the Funds for International Cooperation and Exchange of the National Natural Science Foundation of China (61210001), the General Program of National Natural Science Foundation of China (61571047) and the Fundamental Research Funds for the Central Universities.
References (43)
- et al., Ensemble sparse classification of Alzheimer's disease, NeuroImage (2012)
- et al., Bilinear discriminative dictionary learning for face recognition, Pattern Recogn. (2014)
- et al., Discriminative dictionary learning with low-rank regularization for face recognition
- et al., Compression of facial images using the K-SVD algorithm, J. Vis. Commun. Image Represent. (2008)
- et al., Integration of multi-feature fusion and dictionary learning for face recognition, Image Vis. Comput. (2013)
- J. Mairal, SPAMS: Sparse Modeling Software, Version 2.0,...
- et al., A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM J. Imag. Sci. (2009)
- E. Candes, J. Romberg, l1-magic: Recovery of Sparse Signals via Convex Programming, vol. 4, 2005, p. 14....
- et al., Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems, IEEE J. Sel. Topics Signal Process. (2007)
- et al., A Fixed-point Continuation Method for l1-regularized Minimization with Applications to Compressed Sensing, CAAM TR07-07 (2007)
- Fast Sparse Representation based on Smoothed ℓ0 Norm
- Fast ℓ1-minimization algorithms and an application in robust face recognition: A review
- L1 ls: A Matlab Solver for Large-scale l1-regularized Least Squares Problems
- Sparse representation for color image restoration, IEEE Trans. Image Process.
- Centralized sparse representation for image restoration
- An introduction to compressive sampling, IEEE Signal Process. Mag.
- Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell.
- Robust sparse coding for face recognition
- Handwritten Bangla digit recognition using Sparse Representation Classifier
- Classification and clustering via dictionary learning with structured incoherence and shared features
☆ This paper has been recommended for acceptance by M.T. Sun.