Enhanced (PC)2A for face recognition with one training image per person
Introduction
Face recognition has been an active research area of computer vision and pattern recognition for decades (Turk and Pentland, 1991; Brunelli and Poggio, 1993; Chellappa et al., 1995; Moghaddam and Pentland, 1997; Moghaddam et al., 2000; Sukthankar, 2000; Wiskott et al., 1997; Zhao et al., 2000; Chen and Huang, 2003). Many face recognition methods have been proposed to date, and according to Brunelli and Poggio (1993) they can be roughly classified into two categories: geometric feature-based and template-based. In the first category, the most often used method is elastic bunch graph matching (Wiskott et al., 1997), while in the second, the most widely used algorithm is the eigenface method (Turk and Pentland, 1991). Recently, neural networks (Valentin et al., 1994; Lawrence et al., 1997; Zhang et al., 1997; Raytchev and Murase, 2003), support vector machines (Pang et al., 2003), kernel methods (Lu et al., 2003), and ensemble techniques (Pang et al., 2003) have also found wide application in this area.
In some scenarios, such as law enforcement, only one image per person is available for training the face recognition system. Unfortunately, most face recognition algorithms have difficulty in such scenarios. For example, most subspace methods, such as Linear Discriminant Analysis (LDA) (Etemad and Chellappa, 1997; Lu et al., 2003), discriminant eigenfeatures (Swets and Weng, 1996) and fisherface (Belhumeur et al., 1997), can hardly be used, because good recognition performance requires at least two training images per person so that intra-class variation can be weighed against inter-class variation. Recently, a few researchers have begun to address this issue (Wu and Zhou, 2002; Martinez, 2002). Wu and Zhou (2002) proposed a method called (PC)2A as an extension of the standard eigenface technique: each training image is a linear combination of the original face image and its first-order projected image, and principal component analysis (PCA) is then performed on the combined training image set. (PC)2A was reported to outperform the standard eigenface technique when only one training image per person is available (Wu and Zhou, 2002). Martinez (2002) described a probabilistic approach to modeling faces, in which the key model parameters were estimated from a set of images around a representative face image that were slightly perturbed in position, partially occluded, or expression-variant.
In this paper, we follow the line of Wu and Zhou (2002) and generalize and enhance (PC)2A in two ways. In the first, besides the first-order projected image, we also construct a second-order projected image for each training face; we linearly combine the original image with its first- and second-order projected images into a new image, and then perform PCA on the set of newly combined training images. In the second, instead of combining the original image with its projected images, we deliberately enlarge the original training database by adding a series of n-order projected images. That is, if the database contains M face images of M different persons, with exactly one image per person, we generate n additional images for each person and obtain an enlarged training database of (n+1)M face images, including the M original images; PCA is then performed on the enlarged database. The idea behind both strategies is to squeeze as much information as possible out of each single face image. In the first direction, this information supplies salient features that may be important for face recognition with one training image per person, yielding our first extended version. In the second direction, it provides each person with several imitated face images, so that the problem of face recognition with one training image per person becomes a common face recognition problem, yielding our second extended version. Experiments on a subset of the well-known FERET database show that both enhanced versions of (PC)2A achieve improved recognition accuracy while using only about half as many eigenfaces as (PC)2A.
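The second strategy (enlarging the single-image gallery with projected images before PCA) can be sketched as follows. This is an illustrative sketch only: the exact projection construction and normalisation used in the paper may differ; here each k-order projected image is taken as the outer product of the row and column sums of the element-wise k-th power of the image, rescaled to the original mean intensity.

```python
import numpy as np

def enlarge_gallery(images, n=2, eps=1e-8):
    """Sketch of E(PC)2A2-style gallery enlargement: for each single
    training image, append its 1..n order projected images, so that
    PCA later sees (n+1) samples per person instead of one.

    The projection formula below (outer product of the row/column
    sums of the powered image, rescaled to the same mean intensity)
    is an assumption for illustration, not the paper's exact one."""
    enlarged = []
    for img in images:
        img = img.astype(float)
        enlarged.append(img)
        for k in range(1, n + 1):
            p = img ** k                      # element-wise k-th power
            proj = np.outer(p.sum(axis=1),    # vertical profile
                            p.sum(axis=0))    # horizontal profile
            proj *= p.mean() / (proj.mean() + eps)
            enlarged.append(proj)
    return enlarged
```

With M original images and n = 2, the enlarged gallery holds 3M images, matching the (n+1)M count described above.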
The rest of this paper is organized as follows. Section 2 presents the ways in which we generalize and enhance (PC)2A. Section 3 reports our experiments. Finally, Section 4 concludes.
E(PC)2A1 and E(PC)2A2
In (PC)2A, each original (training) image I(x,y) is linearly combined with its first-order projection into a new version of the image. Such a combination was demonstrated to be helpful to the subsequent recognition process (Wu and Zhou, 2002). A natural extension of (PC)2A is therefore to exploit more information, such as higher-order projections of the original images, to enhance recognition. The projection of an original image is briefly defined as follows
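The combination step of the first extended version, E(PC)2A1, can be sketched in code. This is a minimal illustration under assumptions: the projection normalisation and the combination weights `alpha` and `beta` are placeholders, not the values or the exact formula from Wu and Zhou (2002).

```python
import numpy as np

def projection_image(img, order=1, eps=1e-8):
    """Illustrative n-order projection image: the image is raised
    element-wise to the given power, then its row sums (vertical
    profile) and column sums (horizontal profile) are combined by an
    outer product and rescaled to the mean intensity of the powered
    image. The exact normalisation in the paper may differ."""
    p = img.astype(float) ** order
    v = p.sum(axis=1)            # vertical profile, one value per row
    h = p.sum(axis=0)            # horizontal profile, one value per column
    proj = np.outer(v, h)
    proj *= p.mean() / (proj.mean() + eps)
    return proj

def combine(img, alpha=0.25, beta=0.25):
    """E(PC)2A1-style combined image: the original plus weighted
    first- and second-order projections (alpha and beta are assumed
    weights, chosen here only for illustration)."""
    img = img.astype(float)
    p1 = projection_image(img, order=1)
    p2 = projection_image(img, order=2)
    return (img + alpha * p1 + beta * p2) / (1.0 + alpha + beta)
```

PCA is then performed on the set of combined images rather than the originals, exactly as in the standard eigenface procedure.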
Data set
In our experiments, the new methods presented in Section 2 are compared with both (PC)2A and the standard eigenface technique. The experimental configuration is similar to that described in (Wu and Zhou, 2002). The experimental face database comprises 400 gray-level frontal-view face images of 200 persons, each of size 256 × 384. There are 71 females and 129 males; each person has two images (fa and fb) with different facial expressions. The fa images are used as the gallery for training
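For context, the standard eigenface baseline used in the comparison can be sketched as below. This is a generic PCA pipeline, not (PC)2A itself; the image size, component count, and nearest-neighbour matcher are illustrative assumptions rather than the paper's exact experimental settings.

```python
import numpy as np

def eigenface_recognise(gallery, probes, n_components=20):
    """Minimal eigenface sketch: flatten the gallery images, centre
    them, compute principal components via SVD, project gallery and
    probes into the subspace, and match each probe to the nearest
    gallery face by Euclidean distance."""
    X = np.array([g.ravel() for g in gallery], dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA via SVD of the centred gallery matrix; rows of Vt are the
    # eigenfaces (principal directions in pixel space)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components]
    G = Xc @ W.T                          # gallery coordinates
    preds = []
    for p in probes:
        q = (p.ravel().astype(float) - mean) @ W.T
        d = np.linalg.norm(G - q, axis=1)
        preds.append(int(np.argmin(d)))   # index of closest gallery face
    return preds
```

In the single-image-per-person setting, each gallery index corresponds to one person, so the returned index is directly the predicted identity.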
Conclusions
Most face recognition techniques require at least two training images per person. Recently, a method called (PC)2A was proposed to address the issue of face recognition with one training image per person. In this paper, two directions for generalizing and enhancing (PC)2A are identified and several new algorithms are proposed. These algorithms utilize second-order and even higher-order projections in addition to the first-order projection used by (PC)2A. Experiments show that the enhanced versions achieve improved recognition accuracy while using only about half as many eigenfaces as (PC)2A.
Acknowledgements
This work was supported by the National Natural Science Foundation of China under the Grant no. 60271017, the National Outstanding Youth Foundation of China under the Grant no. 60325237, the Jiangsu Science Foundation under Grant no. BK2002092, and the Qinglan project foundation of Jiangsu province. Portions of the research in this paper use the FERET database of facial images collected under the FERET program.
References (22)
- Chen and Huang, 2003. Facial expression recognition: A clustering-based approach. Pattern Recognition Lett.
- Moghaddam et al., 2000. Bayesian face recognition. Pattern Recognition.
- Pang et al., 2003. Membership authentication in the dynamic group by face classification using SVM ensemble. Pattern Recognition Lett.
- Phillips et al., 1998. The FERET database and evaluation procedure for face-recognition algorithms. Image Vision Comput.
- Raytchev and Murase, 2003. Unsupervised face recognition by associative chaining. Pattern Recognition.
- Valentin et al., 1994. Connectionist models of face processing: A survey. Pattern Recognition.
- Wu and Zhou, 2002. Face recognition with one training image per person. Pattern Recognition Lett.
- Belhumeur et al., 1997. Eigenfaces vs. fisherfaces: Recognition using class specific linear projection. IEEE Trans. Pattern Anal. Machine Intell.
- Bishop, 1995. Neural Networks for Pattern Recognition.
- Brunelli and Poggio, 1993. Face recognition: Features versus templates. IEEE Trans. Pattern Anal. Machine Intell.
- Chellappa et al., 1995. Human and machine recognition of faces: A survey. Proc. IEEE.