Supervised within-class-similar discriminative dictionary learning for face recognition

https://doi.org/10.1016/j.jvcir.2016.04.003

Highlights

  • A discriminative dictionary learning algorithm is put forward for face recognition.

  • The algorithm combines the linear classification error with the within-class scatter.

  • The Fisher ratios of the representation coefficients are enhanced.

  • The proposed method outperforms some state-of-the-art dictionary learning algorithms.

Abstract

The current study puts forward a supervised within-class-similar discriminative dictionary learning (SCDDL) algorithm for face recognition. Popular discriminative dictionary learning schemes for recognition tasks typically incorporate a linear classification error term into the objective function or impose discriminative restrictions on the representation coefficients. In the presented SCDDL algorithm, we propose to directly restrict the representation coefficients to be similar within the same class and to simultaneously include the linear classification error term in the supervised dictionary learning scheme, so as to derive a more discriminative dictionary for face recognition. The experimental results on three large, well-known face databases suggest that our approach can enhance the Fisher ratio of the representation coefficients compared with several dictionary learning algorithms that incorporate linear classifiers. In addition, the learned discriminative dictionary, the large Fisher ratio of the representation coefficients and the simultaneously learned classifier improve the recognition rate compared with some state-of-the-art dictionary learning algorithms.

Introduction

With the development of sparse-coding algorithms based on ℓ0- or ℓ1-norm optimization [1], [2], [3], [4], [5], [6], [7], [8], [9], sparse-representation-based techniques have been widely adopted for a variety of image-processing tasks, such as image restoration and compressive sensing [10], [11], [12]. Recently, sparse representation was extended to recognition tasks in [13] and referred to as sparse-representation-based classification (SRC). SRC achieves encouraging results on face recognition [13], [14], digit recognition [15], [16], and disease diagnosis [17]. SRC assumes that each test sample can be expressed as a sparse linear combination of the dictionary atoms (the dictionary columns), and the test sample is assigned the class label that yields the minimum representation residual. In some SRC applications, especially in the early phases, all of the training samples are used directly as the dictionary. This practice, however, makes the sparse representation computationally time-consuming for large training sets [18]. Thus, dictionary learning (DL) algorithms that construct small but representative dictionaries from the training samples have been proposed and investigated.

To derive a compact dictionary, a variety of DL algorithms have been developed [19], [20], [21]. One of the earliest is the method of optimal directions (MOD) [20]. Another popular DL method is K-SVD, which generalizes the K-means clustering process and alternately updates the dictionary atoms and the sparse codes to minimize the reconstruction error [19]. In [21], a meta-face-learning method was put forward that updates the dictionary atoms with a closed-form solution. These DL algorithms mainly seek an optimal dictionary to represent the data with minimum reconstruction error; they do not use the class-label information of the training data, and can thus be referred to as unsupervised DL. The dictionary and representation coefficients learned by these algorithms may be effective for signal reconstruction but may not be discriminative enough for classification tasks.
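As an illustration of the alternating structure these unsupervised schemes share, the sketch below pairs an OMP sparse-coding step with the closed-form MOD dictionary update $D = Y X^\top (X X^\top)^{-1}$; the coder choice, initialization and iteration count are our assumptions, not prescriptions from [20].

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def mod_dictionary_learning(Y, n_atoms, sparsity, n_iter=20, seed=0):
    """Minimal MOD sketch: alternate OMP sparse coding with the
    closed-form least-squares dictionary update of [20]."""
    rng = np.random.default_rng(seed)
    d, n = Y.shape
    # Initialize atoms with random training samples (our assumption).
    D = Y[:, rng.choice(n, size=n_atoms, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_iter):
        # Sparse-coding step: column-wise OMP with a fixed sparsity level.
        X = orthogonal_mp(D, Y, n_nonzero_coefs=sparsity)
        # MOD step: D = Y X^T (X X^T)^{-1}, via pseudo-inverse for stability.
        D = Y @ X.T @ np.linalg.pinv(X @ X.T)
        D /= np.linalg.norm(D, axis=0) + 1e-12  # re-normalize the atoms
    return D, X
```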

For deriving discriminative dictionaries that are suitable for classification tasks, a number of supervised DL algorithms utilizing the class-label information of the training samples have been developed. There are two categories of supervised DL methods in general. The first category consists of those constructing one dictionary for each class [16], [22], [23], [24], [25], [26]. For example, Ramirez et al. developed a DL algorithm with structured incoherence (DLSI) for the construction of class-specific dictionaries [16]. Yang et al. proposed a Fisher discrimination DL (FDDL) algorithm to derive a dictionary for each class using restrictions that minimize the within-class scatter and maximize the between-class scatter [24]. Ma et al. put forward a discriminative low-rank DL method for sparse representation with a low-rank constraint on the class-specific dictionaries [27]. Li et al. developed a discriminative DL algorithm with low-rank regularization and a Fisher discriminant function [28]. All of these methods obtain one dictionary per class and then identify test samples by minimum reconstruction error. This category of supervised DL can derive representative dictionaries for each class, but the dictionary-updating and sparse-coding processes become costly when the number of classes is large.

The second category of supervised DL algorithms derives one dictionary shared by all of the classes. In this category, the algorithms mainly incorporate discriminative terms into the objective function to obtain a good recognition rate [18], [29], [30], [31], [32], [33]. For example, the linear classification error is used as the discriminative term in [18], [31], [32], [33], while other researchers adopt a logistic loss term in the objective function [30], [33]. Furthermore, Huang and Aviyente applied the Fisher ratio between the inter-class distance and the within-class scatter as a discriminative constraint on the sparse coefficients [29]. These algorithms can be further split into two types: one directly incorporates the classification model (e.g., a linear, logistic or hinge loss function) into the DL framework to obtain a simultaneously derived classifier for classifying test samples [18], [30], [31], [32], [33]; the other imposes discriminative constraints on the representation coefficients (e.g., the Fisher criterion) and then employs the coefficients to construct a classifier for identifying test samples [29]. Would combining classification models with restrictions on the representation coefficients further improve the discriminative power of the dictionary and enhance the recognition rate? Jiang et al. put forward an algorithm named LC-KSVD [18], which adds both a classification error term and a discriminative sparse-code error term to the objective function (see the sketch of its objective after this paragraph). This algorithm achieves improved performance compared to algorithms without the discriminative sparse-code error term, such as D-KSVD [32] and K-SVD [19]. The discriminative sparse-code error term, a label-consistency regularization, forces signals from the same class to have similar representations [18]. This indirect promotion of within-class similarity among the coding coefficients of each class has been shown to improve the recognition rate of the dictionary. One question can therefore be posed: can directly restricting the coefficients to be similar within each class, combined with the linear classification error term, result in a more discriminative dictionary and further improve the recognition rate?
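For reference, the LC-KSVD objective from [18] that motivates this question can be written as

$$\langle D, W, A, X \rangle = \arg\min_{D,W,A,X} \|Y - DX\|_F^2 + \alpha \|Q - AX\|_F^2 + \beta \|H - WX\|_F^2 \quad \text{s.t.}\ \forall i,\ \|x_i\|_0 \le T,$$

where $Y$ is the training-sample matrix, $D$ the dictionary, $Q$ the discriminative sparse-code targets enforcing label consistency, $A$ a linear transform, $H$ the class-label matrix, $W$ the linear classifier, and $T$ the sparsity level. The $\alpha \|Q - AX\|_F^2$ term is the indirect within-class-similarity mechanism discussed above; SCDDL instead penalizes the within-class scatter of $X$ directly.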

In this paper, we propose to directly restrict the within-class scatter of a dictionary’s representation coefficients while simultaneously deriving the linear classification error term in a supervised DL scheme, referred to as the supervised within-class-similar discriminative DL (SCDDL) algorithm. We examine the SCDDL performance using three well-known face databases, the Extended YaleB database [34], the AR face database [35] and the CMU PIE face database [36]. The results show that SCDDL can achieve superior recognition rates in comparison to LC-KSVD [18] and some other state-of-the-art DL algorithms.

The novel contributions of this paper are threefold. First, we propose to directly impose restrictions on the within-class similarity of the representation coefficients while simultaneously deriving the linear classification error in the supervised DL algorithm, so as to construct a more discriminative dictionary. Second, we demonstrate that directly constraining the within-class scatter term in the DL scheme enhances the Fisher criterion (the ratio of between-class scatter to within-class scatter) more than an indirect constraint (LC-KSVD) or no constraint (D-KSVD); a sketch of this measurement follows this paragraph. Third, we show that the combination of directly restricted within-class scatter and a simultaneously derived classifier is a powerful tool for face recognition tasks, compared with several other state-of-the-art DL algorithms.
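To make the second contribution measurable, the sketch below computes a scalar Fisher ratio of a coefficient matrix as the ratio of the traces of the between-class and within-class scatter matrices; the trace-based scalarization and the function name are our assumptions for illustration, since the paper states the criterion only as the ratio of between-class to within-class scatter.

```python
import numpy as np

def fisher_ratio(X, labels):
    """Scalar Fisher ratio tr(S_B) / tr(S_W) of coding coefficients.

    X      : (n_atoms, n_samples) matrix, one coefficient vector per column
    labels : length-n_samples array of class labels

    The trace-based scalarization is an assumption, not the paper's
    exact definition.
    """
    labels = np.asarray(labels)
    mu = X.mean(axis=1, keepdims=True)                 # global mean coefficient
    s_w = s_b = 0.0
    for c in np.unique(labels):
        Xc = X[:, labels == c]
        mu_c = Xc.mean(axis=1, keepdims=True)          # class-c mean coefficient
        s_w += ((Xc - mu_c) ** 2).sum()                # trace of class-c scatter
        s_b += Xc.shape[1] * ((mu_c - mu) ** 2).sum()  # weighted between-class term
    return s_b / s_w
```

A larger value means same-class coefficients cluster tightly around well-separated class means, which is exactly what the direct within-class constraint is meant to encourage.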

The rest of the paper is organized as follows. Section 2 introduces the DL-related algorithms. Section 3 describes the proposed SCDDL model. Section 4 presents the experimental results and Section 5 concludes the paper.


Sparse representation-based classification

Sparse-representation-based classification (SRC) was put forward by Wright et al. for face recognition [13]. Suppose $n$ $d$-dimensional training samples from $k$ classes are denoted as $A = [A_1, \ldots, A_l, \ldots, A_k] \in \mathbb{R}^{d \times n}$, where $A_l \in \mathbb{R}^{d \times n_l}$ consists of the $n_l$ training samples from class $l$ ($l = 1, 2, \ldots, k$). When $A$ is used directly as the dictionary (see below for dictionary learning frameworks that construct more compact dictionaries), then for a test sample $y \in \mathbb{R}^d$, the SRC framework is as follows:

  • (i) Sparse-coding of $y$: solve $\hat{x} = \arg\min_x \|x\|_1$ subject to $\|y - Ax\|_2 \le \varepsilon$.

  • (ii) Class-wise residual computation: $r_l(y) = \|y - A\,\delta_l(\hat{x})\|_2$, where $\delta_l(\hat{x})$ keeps only the entries of $\hat{x}$ associated with class $l$ and zeros out the rest.

  • (iii) Classification: assign $y$ to the class with the minimum residual, $\mathrm{identity}(y) = \arg\min_l r_l(y)$.
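A minimal runnable sketch of these three steps is given below, assuming a Lasso-based ℓ1 coder from scikit-learn in place of whichever solver [13] uses; the regularization weight `alpha` is likewise our assumption.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(A, labels, y, alpha=0.01):
    """SRC sketch: l1 sparse-coding of y over the training matrix A
    (columns are training samples), then minimum-residual classification.
    The Lasso coder and alpha are stand-ins, not the paper's choices."""
    # (i) Sparse-coding: x_hat ~ argmin ||y - Ax||_2^2 + alpha * ||x||_1.
    lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
    x_hat = lasso.fit(A, y).coef_
    labels = np.asarray(labels)
    residuals = {}
    for c in np.unique(labels):
        # (ii) delta_l(x_hat): keep only the coefficients of class c.
        x_c = np.where(labels == c, x_hat, 0.0)
        residuals[c] = np.linalg.norm(y - A @ x_c)
    # (iii) Assign the label with the minimum representation residual.
    return min(residuals, key=residuals.get)
```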

Supervised within-class-similar discriminative dictionary learning

A novel DL scheme named supervised within-class-similar discriminative dictionary learning (SCDDL) is proposed in this section to create a discriminative dictionary for classification.
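Although the full model is specified later in this section, the ingredients described in the Introduction (a linear classification error term and a direct restriction on the within-class scatter of the representation coefficients) suggest an objective of the following general shape; the weights $\alpha$, $\beta$, $\lambda$ and the exact forms of the scatter and sparsity terms are our assumptions for illustration, not the paper's final formulation:

$$\min_{D,\,W,\,X}\ \|Y - DX\|_F^2 \;+\; \alpha\,\|H - WX\|_F^2 \;+\; \beta\,\mathrm{tr}\big(S_W(X)\big) \;+\; \lambda\,\|X\|_1,$$

where $Y$ holds the training samples, $D$ is the dictionary shared by all classes, $H$ is the binary class-label matrix, $W$ is the jointly learned linear classifier, and $S_W(X) = \sum_{l=1}^{k} \sum_{x_i \in \text{class } l} (x_i - m_l)(x_i - m_l)^\top$ is the within-class scatter of the coding coefficients, with $m_l$ the mean coefficient vector of class $l$. Penalizing $\mathrm{tr}(S_W(X))$ pulls same-class coefficients toward their class mean, which is the "within-class-similar" restriction, while $\|H - WX\|_F^2$ is the linear classification error.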

Databases

Three face databases, the Extended YaleB database [34], the AR face database [35] and the CMU PIE face database [36], are used to evaluate SCDDL in this study.

Extended YaleB [34]: The Extended YaleB database consists of 2414 face images of 38 subjects under varying illumination conditions and expressions, with approximately 64 images per subject. Sample images of two subjects are shown in Fig. 1. The images were cropped to 192 × 168 pixels. Half of the images are used for training and the other half for testing.

Conclusions

This paper proposes a new DL algorithm for face recognition named supervised within-class-similar discriminative dictionary learning (SCDDL). The main contribution is the combination of a direct restriction on the within-class-similar term of the representation coefficients with a simultaneous restriction on the linear classification error in the objective function, which can be viewed as combining ideas from FDDL with those of D-KSVD and LC-KSVD. Notably, the direct restriction of the within-class-similar term enhances the Fisher ratio of the representation coefficients more than an indirect restriction or none at all, which helps to explain the improved recognition rates.

Acknowledgements

This work is supported by the 863 Program (2015AA020912), the Funds for International Cooperation and Exchange of the National Natural Science Foundation of China (61210001), the General Program of National Natural Science Foundation of China (61571047) and the Fundamental Research Funds for the Central Universities.

References (43)

  • G.H. Mohimani et al., Fast Sparse Representation based on Smoothed ℓ0 Norm (2007).

  • L. Rosasco, A. Verri, M. Santoro, S. Mosci, S. Villa, Iterative Projection Methods for Structured Sparsity...

  • A.Y. Yang et al., Fast ℓ1-minimization algorithms and an application in robust face recognition: a review.

  • K. Koh et al., l1_ls: A Matlab Solver for Large-scale ℓ1-regularized Least Squares Problems (2007).

  • J. Mairal et al., Sparse representation for color image restoration, IEEE Trans. Image Process. (2008).

  • W. Dong et al., Centralized sparse representation for image restoration.

  • E.J. Candès et al., An introduction to compressive sampling, IEEE Signal Process. Mag. (2008).

  • J. Wright et al., Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell. (2009).

  • M. Yang et al., Robust sparse coding for face recognition.

  • H.A. Khan et al., Handwritten Bangla digit recognition using Sparse Representation Classifier.

  • I. Ramirez et al., Classification and clustering via dictionary learning with structured incoherence and shared features.


    This paper has been recommended for acceptance by M.T. Sun.
