Singular Value Decomposition Projection for solving the small sample size problem in face recognition

https://doi.org/10.1016/j.jvcir.2014.09.013

Highlights

  • The new linear transformation produced by SVDP makes the obtained projection row-orthonormal.

  • In SVDP, row-orthonormality ensures that the features of the projected samples are uncorrelated with each other.

  • SVDP keeps the simplicity and effectiveness of an unsupervised dimensionality reduction algorithm.

Abstract

Numerous dimensionality reduction methods have achieved impressive performance in the field of face recognition (FR) due to their ability to exploit the intrinsic structure of images and to improve computational efficiency. However, FR methods based on existing dimensionality reduction techniques often suffer from the small sample size (SSS) problem, where the sample dimensionality is larger than the number of training samples per subject. In recent years, Sparse Representation based Classification (SRC) has been demonstrated to be a powerful framework for robust FR. In this paper, a novel unsupervised dimensionality reduction algorithm, called Singular Value Decomposition Projection (SVDP), is proposed to better fit SRC for handling the SSS problem in FR. In SVDP, a weighted linear transformation matrix is derived from the original data matrix via Singular Value Decomposition. The projection obtained in this way is row-orthonormal, which makes the solution robust to small perturbations in the data and improves its ability to represent various signals. Thus, SVDP better preserves the discriminant information of the data. Based on SVDP, a novel face recognition method, SVDP-SRC, is designed to enable SRC to achieve better performance via low-dimensional representations of faces. Experiments carried out on simulated data show that SVDP achieves higher recovery accuracy than several other dimensionality reduction methods. Moreover, results obtained on three standard face databases demonstrate that SVDP-SRC handles the SSS problem effectively in terms of recognition accuracy.

Introduction

Face recognition (FR) has been one of the hottest and most challenging research topics in computer vision and pattern recognition over the past two decades [1], [2], [3]. The target of FR is to identify a person from a digital face image taken from a photograph or a video frame sequence of him/her. Although many research groups have devoted their efforts to the problem and obtained much success under a variety of conditions, there still exist many unsolved problems in the research field of FR [4]. In particular, in many real-world FR scenarios, faces often exhibit large pose and illumination variations, yet there are usually only a few images recorded per person due to limited storage space and capture time. Under these circumstances, the task of FR is actually a small sample size (SSS) classification problem. Thus, how to learn a good classification rule from limited training samples is a fundamental and challenging task in real-world applications of FR.

Recently, Wright et al. [5] proposed a Sparse Representation based Classification (SRC) framework for FR based on the ℓ1-norm. The main idea of SRC can be summarized in three steps: (a) compute the sparse representation of a test sample over the training data set; (b) calculate the reconstruction errors obtained by reconstructing the test sample from the sparse representation coefficients associated with the training samples of each class, respectively; (c) assign the test sample to the class with the minimum reconstruction residual. This method has been proven effective for robust FR under varying expression and illumination. However, face data usually have high dimensionality. In the research field of FR, the training samples are generally stored in a matrix A with each column corresponding to a sample (i.e., a face image). Suppose the dimensionality of each sample is m and the number of samples is n. In general, m is much larger than n. According to Compressed Sensing (CS) theory [6], [7], [8], the minimum ℓ1-norm solution to an underdetermined system of linear equations y = Ax (A ∈ R^{m×n}, m < n) is, under general conditions, also the sparsest possible solution. Because the first step of SRC solves an ℓ1-minimization problem, it is necessary to first perform dimensionality reduction so that the system becomes underdetermined and a sparse representation can be obtained [5]. Meanwhile, dimensionality reduction can also eliminate noise in the data to some degree [9], [10]. Therefore, dimensionality reduction is an essential step for obtaining good performance when employing SRC to implement FR.
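The three SRC steps described above can be sketched with a generic linear-programming ℓ1 solver. This is a minimal illustration under the standard basis-pursuit reformulation, not the authors' implementation; practical SRC systems use dedicated solvers such as ℓ1-magic or GPSR.

```python
import numpy as np
from scipy.optimize import linprog

def sparse_code_l1(A, y):
    """Step (a): solve min ||x||_1 s.t. Ax = y via the standard LP
    reformulation x = u - v with u, v >= 0 (basis pursuit)."""
    n = A.shape[1]
    res = linprog(c=np.ones(2 * n),           # minimize sum(u) + sum(v) = ||x||_1
                  A_eq=np.hstack([A, -A]),    # A(u - v) = y
                  b_eq=y, bounds=(0, None), method="highs")
    return res.x[:n] - res.x[n:]

def src_classify(A, labels, y):
    """Steps (b) and (c): compute per-class reconstruction residuals,
    then pick the class with the minimum residual."""
    x = sparse_code_l1(A, y)
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - A @ np.where(labels == c, x, 0.0))
                 for c in classes]
    return classes[int(np.argmin(residuals))]

# Demo: the test sample is exactly a training sample of class 0, so the
# sparsest representation concentrates on that column and class 0 wins.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))               # m = 4 features, n = 6 samples (m < n)
A /= np.linalg.norm(A, axis=0)                # unit-norm columns, as in SRC
labels = np.array([0, 0, 0, 1, 1, 1])
print(src_classify(A, labels, A[:, 1]))       # -> 0
```

Since the six normalized columns have mutual coherence below one, ℓ1 minimization is guaranteed to recover the 1-sparse representation here, so the class-0 residual is essentially zero.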

According to whether the information provided by class labels is used, existing dimensionality reduction methods can be roughly grouped into three categories: supervised, semi-supervised and unsupervised methods. Among unsupervised techniques, PCA [11], Random Projection [12] and Downsampling [13] are three prominent representatives, and all three were adopted by Wright et al. [5]. Due to its simplicity and effectiveness in practical applications, PCA has been popularized throughout scientific research and engineering. Random Projection has the appealing property that its computation is simple and efficient; at the same time, it can approximately preserve the local structure of the original data. In the FR community, Downsampling helps reduce the differences caused by varying facial expression and pose between images of the same face.
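The claim that random projection approximately preserves the structure of the data can be checked numerically: after a scaled Gaussian random projection, pairwise Euclidean distances stay close to the originals (the Johnson-Lindenstrauss effect). The dimensions below are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1024, 50))               # 50 samples in R^1024, one per column
d = 256
R = rng.standard_normal((d, 1024)) / np.sqrt(d)   # scaled Gaussian random projection
Y = R @ X                                         # 50 samples in R^256

def pdists(M):
    # Pairwise Euclidean distances between the columns of M.
    G = M.T @ M
    sq = np.diag(G)
    return np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2 * G, 0.0))

pairs = np.triu_indices(50, 1)
ratio = pdists(Y)[pairs] / pdists(X)[pairs]
print(ratio.min(), ratio.max())                   # distortion stays close to 1
```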

Some previous work has demonstrated that human faces containing complex variations can be effectively modeled by low-dimensional linear subspaces [14], [15]. Furthermore, extracting low-dimensional structures of the face data can not only reduce the computational cost and storage needs of FR algorithms, but also mitigate the effect of high-dimensional noise in the data, so that the performance of the corresponding classification methods can be further improved. However, most traditional dimensionality reduction methods require as many training samples as possible per person, and their performance in general depends heavily on the number of training samples. As the number of training samples decreases, many feature extraction methods, such as the most popular Eigenface [11], perform badly because the intra-personal and inter-personal variations are then estimated with large bias [16]. Therefore, a more advanced dimensionality reduction method is required for the SSS problem.

In this paper, we present a novel unsupervised dimensionality reduction method called Singular Value Decomposition Projection (SVDP), and a new FR method, SVDP-SRC, which better fits SRC for solving SSS problems. The proposed dimensionality reduction method is motivated by two ℓ1-minimization solvers, ℓ1-magic [17] and GPSR [18], which achieve better accuracy when the rows of A are orthonormalized than when the original matrix A is used directly. Thus, one purpose of SVDP is to make the projected sample matrix row-orthonormal. Meanwhile, for the SSS problem, because of the dimensionality reduction requirement and the lack of samples, each feature of the samples in the subspace is critical. Another purpose of SVDP is therefore to make the features of samples in the subspace significantly different and less correlated with each other. Experimental results obtained on a synthetic data set and three standard FR databases show that SVDP achieves better results than several conventional baseline methods.

The advantages of our proposed dimensionality reduction algorithm (SVDP) can be summarized as follows:

  • (1)

    The new linear transformation produced by SVDP makes the obtained projection row-orthonormal. This projection is well-conditioned: it reduces the time needed to solve the problem and is more robust to small perturbations.

  • (2)

    In SVDP, row-orthonormality makes the features of the projected samples uncorrelated with each other, with roughly equal variances across features. This enables our method to capture more of the signal variation and better represent the test sample.

  • (3)

    SVDP keeps the simplicity and effectiveness of an unsupervised dimensionality reduction algorithm, which avoids a drawn-out process of tuning various parameters.
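Properties (1) and (2) can be illustrated with a small sketch. The exact weighting used by SVDP is not reproduced in this excerpt; as one natural SVD-based construction, projecting with P = S_d^{-1} U_d^T (the first d left singular vectors, reweighted by the inverse singular values) yields Y = P A = V_d^T, whose rows are exactly orthonormal.

```python
import numpy as np

def svd_row_orthonormal_projection(A, d):
    """Illustrative SVD-based projection (an assumption, not the paper's exact
    weighting): with the thin SVD A = U S V^T, the weighted transformation
    P = diag(1/s_1..1/s_d) @ U_d^T maps A to Y = V_d^T, so Y @ Y.T = I_d."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    P = np.diag(1.0 / s[:d]) @ U[:, :d].T     # weighted linear transformation (d x m)
    return P, P @ A

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 20))            # m = 100 dims, n = 20 samples (m >> n)
P, Y = svd_row_orthonormal_projection(A, d=8)
print(np.allclose(Y @ Y.T, np.eye(8)))        # rows of Y are orthonormal
```

Row-orthonormality of Y means the projected features are uncorrelated and equally scaled, which is exactly the property claimed in points (1) and (2).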

The rest of this paper is organized as follows. Section 2 presents a brief description of the theoretical foundation of SVD and SRC as well as some related works. This is followed by introducing our proposed novel SVDP-SRC algorithm to efficiently implement face recognition in Section 3. In Section 4, some experiments are carried out on a synthetic data set and three standard FR databases to examine the performance of the proposed algorithm and compare it with several other baseline methods. Finally, Section 5 offers the conclusions of this paper.

Vectors are denoted by bold lower case letters and matrices by bold upper case letters. For any vector α ∈ R^k, we define ‖α‖₂ = (Σ_{i=1}^{k} α_i²)^{1/2} and ‖α‖₁ = Σ_{i=1}^{k} |α_i|. The input data matrix is A ∈ R^{m×k}, while the reduced low-dimensional data matrix Y ∈ R^{d×k} is formed with d features and k samples. We let y_{:,i} ∈ R^d and y_{j,:} ∈ R^k denote the ith column and the jth row of Y, respectively. Thus, Y = [y_{:,1}, y_{:,2}, …, y_{:,k}] = [y_{1,:}^T, y_{2,:}^T, …, y_{d,:}^T]^T.

Section snippets

Dimensionality reduction

Dimensionality reduction is an essential technique for processing high-dimensional data. The goal of dimensionality reduction is to minimize redundancy of the original data. Up to now, linear techniques are still important due to their mathematical tractability and effectiveness for many real-world problems such as face recognition. The generic problem of linear dimensionality reduction can be put into the following framework. Given a signal (or an image in vector form) a_i ∈ R^m, and a

Singular Value Decomposition Projection for Sparse Representation based Classification

As shown in Section 2, SRC classifies a test sample based on its sparse representation coefficient α. Ideally, the nonzero entries of α should concentrate on the training samples that belong to the same class as the test sample. In some typical applications of face recognition in uncontrolled environments, however, the training samples may not be enough to represent the test sample. Thus, the intra-class variation can hardly be considered against the inter-class variation, and the residual

Experimental study

In this section, a synthetic data set and three real face databases, AR [29], PIE [30] and FERET [31], are used to evaluate the performance of our proposed method. The aim of the first experiment, conducted on synthetic data, is to compare the recovery accuracy of SVDP with that of several other algorithms. As for the second experiment, performed on the real AR and PIE databases, we compare the recognition accuracy of the considered algorithms. Since the focus of this paper is

Conclusions

In this paper, an effective Sparse Representation based Classification algorithm assisted by a novel dimensionality reduction method, Singular Value Decomposition Projection (SVDP), is presented for robust face recognition when the number of samples is limited (i.e., the small sample size problem). The main aim of SVDP is to utilize the transformation matrix to reweight the reconstructive relationship of the original data in the low-dimensional subspace and make the projection sample matrix

Acknowledgements

This work was supported by the National Basic Research Program of China (973 Program) under Grant No. 2013CB329404, the Major Research Project of the National Natural Science Foundation of China under Grant No. 91230101, the National Natural Science Foundation of China under Grant No. 61075006 and 11201367, and the Key Project of the National Natural Science Foundation of China under Grant No. 11131006.

References (32)

  • D. Donoho

    For most large underdetermined systems of linear equations the minimal ℓ1-norm solution is also the sparsest solution

    Commun. Pure Appl. Math.

    (2006)
  • S. Chen et al.

    Atomic decomposition by basis pursuit

    SIAM Rev.

    (2001)
  • A. Bruckstein et al.

    From sparse solutions of systems of equations to sparse modeling of signals and images

    SIAM Rev.

    (2009)
  • Effrosyni Kokiopoulou et al.

    Orthogonal neighborhood preserving projections: a projection-based dimensionality reduction technique

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2007)
  • M. Turk et al.

    Eigenfaces for recognition

    J. Cognit. Neurosci.

    (1991)
  • N. Goel et al.

    Face recognition experiments with random projection

    Proc. SPIE

    (2005)