Singular Value Decomposition Projection for solving the small sample size problem in face recognition
Introduction
Face recognition (FR) has been one of the hottest and most challenging research topics in computer vision and pattern recognition over the past two decades [1], [2], [3]. The goal of FR is to identify a person from a digital face image taken from a photograph or a video frame sequence. Although many research groups have devoted their efforts to it and achieved considerable success under a variety of conditions, many unsolved problems remain in the field [4]. In particular, in many real-world FR scenarios, faces exhibit large pose and illumination variations, yet only a few images per person are usually recorded due to limited storage space and capture time. Under these circumstances, FR is actually a small sample size (SSS) classification problem. Thus, how to learn a good classification rule from limited training samples is a fundamental and challenging task in real-world applications of FR.
Recently, Wright et al. [5] proposed a Sparse Representation based Classification (SRC) framework for FR based on the ℓ1-norm. The main idea of SRC can be summarized in three steps: (a) compute the sparse representation of a test sample over the training data set; (b) calculate the reconstruction errors obtained by reconstructing the test sample from the sparse representation coefficients associated with the training samples of each class, respectively; (c) assign the test sample to the class with the minimum reconstruction residual error. This method has been proven effective for robust FR under varying expression and illumination. However, face data usually have high dimensionality. In the research field of FR, the training samples are generally stored in a matrix with each column corresponding to a sample (i.e., a face image). Suppose the dimensionality of each sample is m and the number of samples is n. In general, m is much larger than n. According to Compressed Sensing (CS) theory [6], [7], [8], the minimum ℓ1-norm solution to an underdetermined system of linear equations is, under general conditions, also the sparsest possible solution. Because the first step of SRC solves an ℓ1-minimization problem, it is necessary to first perform dimensionality reduction so that the system is underdetermined and a sparse representation can be obtained [5]. Meanwhile, dimensionality reduction can also eliminate, to some degree, the noise included in the data [9], [10]. Therefore, dimensionality reduction is an essential step for good performance when employing SRC to implement FR.
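The three SRC steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: scikit-learn's Lasso stands in for the exact ℓ1-minimization solvers used in the literature, and the function name `src_classify` and the penalty weight `alpha` are our own choices.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(A, labels, y, alpha=0.01):
    """Sketch of Sparse Representation based Classification (SRC).

    A      : (m, n) matrix of training samples, one column per sample
    labels : length-n array of class labels for the columns of A
    y      : (m,) test sample
    alpha  : l1 penalty weight (a stand-in for an exact l1-minimization solver)
    """
    # Step (a): sparse code of the test sample over the training dictionary
    lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
    lasso.fit(A, y)
    x = lasso.coef_
    # Step (b): per-class reconstruction residuals, keeping only the
    # coefficients that belong to each class in turn
    residuals = {}
    for c in np.unique(labels):
        xc = np.where(labels == c, x, 0.0)
        residuals[c] = np.linalg.norm(y - A @ xc)
    # Step (c): assign the class with the minimum residual
    return min(residuals, key=residuals.get)
```

In a real system the columns of A would first be projected to a low-dimensional subspace and normalized, as the paper discusses next.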
According to whether the information provided by class labels is used, existing dimensionality reduction methods can be roughly grouped into three categories: supervised, semi-supervised and unsupervised methods. Among unsupervised techniques, PCA [11], Random Projection [12] and Downsampling [13] are three prominent representatives, and all three were adopted for SRC by Wright et al. [5]. Due to its simplicity and effectiveness in many practical applications, PCA has become popular throughout scientific research and engineering. Random Projection has the attractive property that its computation is simple and efficient, while approximately preserving the local structure of the original data. In the FR community, Downsampling helps reduce the differences caused by varying facial expression and pose between images of the same face.
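The three unsupervised techniques can be sketched as follows, assuming the paper's convention of one sample per column; the function names are ours:

```python
import numpy as np

def pca_projection(X, d):
    """PCA: project onto the top-d principal directions of the training data.
    X is (m, n) with one sample per column; returns a (d, m) projection."""
    Xc = X - X.mean(axis=1, keepdims=True)      # center the samples
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :d].T

def random_projection(m, d, seed=0):
    """Random projection: a (d, m) matrix with i.i.d. Gaussian entries,
    scaled so that distances are approximately preserved."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((d, m)) / np.sqrt(d)

def downsample(image, factor):
    """Downsampling: keep every `factor`-th pixel of a 2-D face image."""
    return image[::factor, ::factor]
```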
Some previous work has demonstrated that human faces containing complex variations can be effectively modeled by low-dimensional linear subspaces [14], [15]. Furthermore, extracting low-dimensional structures from the face data can not only reduce the computational cost and storage needs of FR algorithms, but also mitigate the effect of high-dimensional noise in the data, so that the performance of the corresponding classification methods can be further improved. However, most traditional dimensionality reduction methods require as many training samples per person as possible, and their performance in general depends heavily on the number of training samples. As the number of training samples decreases, many feature extraction methods, including the popular Eigenface method [11], perform badly because the intra-personal and inter-personal variations are then estimated with large bias [16]. Therefore, a more advanced dimensionality reduction method is required for the SSS problem.
In this paper, we present a novel unsupervised dimensionality reduction method called Singular Value Decomposition Projection (SVDP), and a new FR method, SVDP-SRC, which better fits SRC for solving SSS problems. The proposed dimensionality reduction method is motivated by the two ℓ1-minimization solvers ℓ1-magic [17] and GPSR [18], which achieve better accuracy when the rows of the sample matrix are orthonormalized than when the original matrix is used directly. Thus, one purpose of SVDP is to make the projected sample matrix row-orthonormal. Meanwhile, for the SSS problem, the dimensionality reduction requirement and the lack of samples make every feature of the samples in the subspace critical. Another purpose of SVDP is therefore to make the features of the samples in the subspace significantly different and less correlated with each other. The experimental results obtained on a synthetic data set and three standard FR databases show that SVDP achieves better results than several conventional baseline methods.
The advantages of our proposed dimensionality reduction algorithm (SVDP) can be summarized as follows:
- (1) The new linear transformation produced by SVDP makes the obtained projection row-orthonormal. This projection is well-conditioned: the resulting problem can be solved in less time and is more robust to small perturbations.
- (2) In SVDP, the row-orthonormality makes the features of the projected samples uncorrelated with each other, with roughly equal variances. This enables our method to capture more of the signal variation and better represent the test sample.
- (3) SVDP keeps the simplicity and effectiveness of an unsupervised dimensionality reduction algorithm, avoiding a drawn-out process of tuning various parameters.
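A minimal sketch of the row-orthonormality goal stated above, under our own assumption about the construction (the exact SVDP procedure is given later in the paper): taking the SVD of the training matrix and scaling the left singular vectors by the inverse singular values yields a projection whose projected sample matrix has orthonormal rows.

```python
import numpy as np

def svdp_sketch(X, d):
    """Return a (d, m) projection P such that Y = P @ X has orthonormal
    rows (Y @ Y.T = I_d): the reduced features are mutually uncorrelated
    and share the same scale.  X is (m, n), one sample per column."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Dividing each of the first d left singular vectors by its singular
    # value gives P = diag(1/s_d) @ U_d.T, so P @ X = V_d.T, which is
    # row-orthonormal by construction.
    return (U[:, :d] / s[:d]).T
```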
The rest of this paper is organized as follows. Section 2 presents a brief description of the theoretical foundations of SVD and SRC as well as some related work. Section 3 then introduces our proposed SVDP-SRC algorithm for efficiently implementing face recognition. In Section 4, experiments are carried out on a synthetic data set and three standard FR databases to examine the performance of the proposed algorithm and compare it with several other baseline methods. Finally, Section 5 concludes the paper.
Vectors are denoted by bold lower-case letters and matrices by bold upper-case letters. For any vector v, we denote its ℓ1-norm and ℓ2-norm by ||v||1 and ||v||2, respectively. The input data matrix is denoted by X, while the reduced low-dimensional data matrix Y is formed with d features and k samples. We assume that xi and yj are the ith column of X and the jth row of Y, respectively.
Dimensionality reduction
Dimensionality reduction is an essential technique for processing high-dimensional data. Its goal is to minimize the redundancy of the original data. Up to now, linear techniques remain important due to their mathematical tractability and effectiveness for many real-world problems such as face recognition. The generic problem of linear dimensionality reduction can be put into the following framework. Given a signal (or an image in vector form), and a
Singular Value Decomposition Projection for Sparse Representation based Classification
As shown in Section 2, SRC classifies a test sample based on its sparse representation coefficient α. Ideally, the nonzero entries of α should concentrate on the training samples that belong to the same class as the test sample. In some typical applications of face recognition in uncontrolled environments, however, the training samples may not be enough to represent the test sample. Thus, the intra-class variation can hardly be considered against the inter-class variation, and the residual
Experimental study
In this section, a synthetic data set and three real face databases, AR [29], PIE [30] and FERET [31], are used to evaluate the performance of our proposed method. The first experiment, conducted on synthetic data, compares the recovery accuracy of SVDP with that of several other algorithms. The second experiment, conducted on the real AR and PIE databases, compares the recognition accuracy of the considered algorithms. Since the focus of this paper is
Conclusions
In this paper, an effective Sparse Representation based Classification algorithm assisted by a novel dimensionality reduction method, Singular Value Decomposition Projection (SVDP), is presented for robust face recognition when the number of samples is limited (i.e., the small sample size problem). The main aim of SVDP is to utilize the transformation matrix to reweight the reconstructive relationship of the original data in the low-dimensional subspace and make the projected sample matrix
Acknowledgements
This work was supported by the National Basic Research Program of China (973 Program) under Grant No. 2013CB329404, the Major Research Project of the National Natural Science Foundation of China under Grant No. 91230101, the National Natural Science Foundation of China under Grant No. 61075006 and 11201367, and the Key Project of the National Natural Science Foundation of China under Grant No. 11131006.
References (32)
- et al., Automatic recognition and analysis of human faces and facial expressions: a survey, Pattern Recognit. (1992)
- et al., Automatic facial expression analysis: a survey, Pattern Recognit. (2003)
- et al., Robust recognition using Eigenimages, Comput. Vis. Image Understand. (2000)
- et al., Face recognition via weighted sparse representation, J. Visual Commun. Image Represent. (2013)
- et al., A new decision rule for sparse representation based classification for face recognition, Neurocomputing (2013)
- Condition numbers and their condition numbers, Linear Algebra Appl. (1995)
- et al., Multi-PIE, Image Vis. Comput. (2010)
- et al., Face recognition: a literature survey, ACM Comput. Surv. (2003)
- et al., Introduction to the special section on the real-world face, IEEE Trans. Pattern Anal. Mach. Intell. (2011)
- et al., Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell. (2009)
- For most large underdetermined systems of linear equations the minimal ℓ1-norm solution is also the sparsest solution, Commun. Pure Appl. Math.
- Atomic decomposition by basis pursuit, SIAM Rev.
- From sparse solutions of systems of equations to sparse modeling of signals and images, SIAM Rev.
- Orthogonal neighborhood preserving projections: a projection-based dimensionality reduction technique, IEEE Trans. Pattern Anal. Mach. Intell.
- Eigenfaces for recognition, J. Cognit. Neurosci.
- Face recognition experiments with random projection, Int. Soc. Optics Photonics