
Pattern Recognition Letters

Volume 25, Issue 10, 16 July 2004, Pages 1173-1181

Enhanced (PC)2A for face recognition with one training image per person

https://doi.org/10.1016/j.patrec.2004.03.012

Abstract

Recently, a method called (PC)2A was proposed to deal with face recognition with one training image per person. As an extension of the standard eigenface technique, (PC)2A linearly combines each original face image with its corresponding first-order projection into a new face image and then performs principal component analysis (PCA) on the set of newly combined (training) images. It was reported that (PC)2A could achieve higher accuracy than the eigenface technique while using 10–15% fewer eigenfaces. In this paper, we generalize and further enhance (PC)2A along two directions. In the first direction, we combine the original image with its second-order projections as well as its first-order projection, in order to acquire more information from the original face, and then similarly apply PCA to the set of combined images. In the second direction, instead of combining them, we regard the projections of each original image as separate derived images that augment the training image set, and then perform PCA on all the training images available, the original ones as well as the derived ones. Experiments on the well-known FERET database show that the enhanced versions of (PC)2A are about 1.6–3.5% more accurate and use about 47.5–64.8% fewer eigenfaces than (PC)2A.

Introduction

Face recognition has been an active research area of computer vision and pattern recognition for decades (Turk and Pentland, 1991; Brunelli and Poggio, 1993; Chellappa et al., 1995; Moghaddam and Pentland, 1997; Moghaddam et al., 2000; Sukthankar, 2000; Wiskott et al., 1997; Zhao et al., 2000; Chen and Huang, 2003). Many face recognition methods have been proposed to date; according to Brunelli and Poggio (1993), they can be roughly classified into two categories: geometric feature-based and template-based. In the first category, the most often used method is elastic bunch graph matching (Wiskott et al., 1997), while in the second category, the most widely used algorithm is the eigenface (Turk and Pentland, 1991). Recently, neural networks (Valentin et al., 1994; Lawrence et al., 1997; Zhang et al., 1997; Raytchev and Murase, 2003), support vector machines (Pang et al., 2003), kernel methods (Lu et al., 2003), and ensemble techniques (Pang et al., 2003) have also found wide application in this area.

In some specific scenarios, such as law enforcement, only one image per person is available for training the face recognition system. Unfortunately, most face recognition algorithms have problems in such scenarios. For example, most subspace methods, such as linear discriminant analysis (LDA) (Etemad and Chellappa, 1997; Lu et al., 2003), discriminant eigenfeatures (Swets and Weng, 1996) and fisherface (Belhumeur et al., 1997), can hardly be used, because to obtain good recognition performance they require at least two training images per person so that the intra-class variation can be weighed against the inter-class variation. Recently, a few researchers have begun to address this issue (Wu and Zhou, 2002; Martinez, 2002). In (Wu and Zhou, 2002), a method called (PC)2A was proposed as an extension of the standard eigenface technique: each training image used is a linear combination of the original face image and its first-order projected image, and principal component analysis (PCA) is then performed on the combined training image set. It was reported that (PC)2A outperformed the standard eigenface technique when only one training image per person is available (Wu and Zhou, 2002). In (Martinez, 2002), a probabilistic approach was described to model faces, where the key model parameters were estimated from a set of images around a representative face image that are slightly perturbed in position, partially occluded, or expression-variant.
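For concreteness, the following is a minimal sketch of the (PC)2A combination step, assuming the definition in Wu and Zhou (2002): the first-order projection is P1(x,y) = V(x)H(y)/J, where V and H are the vertical and horizontal integral projections and J is the total image intensity. The function names and the weight alpha=0.25 are illustrative choices, not the values from the paper.

    import numpy as np

    def first_order_projection(img):
        # Integral projections of a grayscale image (2-D NumPy array).
        img = np.asarray(img, dtype=float)
        V = img.sum(axis=1, keepdims=True)   # vertical projection, shape (rows, 1)
        H = img.sum(axis=0, keepdims=True)   # horizontal projection, shape (1, cols)
        J = img.sum()                        # total image intensity
        # Assumed definition: P1(x, y) = V(x) * H(y) / J
        return (V @ H) / J

    def pc2a_image(img, alpha=0.25):
        # (PC)2A-style blend of an image with its first-order projection;
        # alpha is a hypothetical example weight, not the paper's tuned value.
        img = np.asarray(img, dtype=float)
        return (img + alpha * first_order_projection(img)) / (1.0 + alpha)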

In this paper, we follow the line of Wu and Zhou (2002) but generalize and enhance (PC)2A in two ways, as sketched below. In the first way, besides the first-order projected image, we also construct second-order projected images for each training face image, linearly combine the first- and second-order projected images with the original image into a new image, and then perform PCA on the set of newly combined training images. In the second way, instead of combining the original image with its projected images, we purposefully enlarge the original training image database by adding a series of n-order projected images. That is, if the image database contains M face images corresponding to M different persons, with exactly one image per person, we generate n additional images for each person and thereby obtain an enlarged training database comprising (n+1)M face images, including the M original ones. PCA is then performed on the enlarged image database. The idea behind both ways is to squeeze as much information as possible out of each single original face image. This information captures salient features that may be important for face recognition with one training image per person, which gives the first extended version along the first direction. It can also be used to provide each person with several imitated face images, so that face recognition with one training image per person reduces to a common face recognition problem, which gives the second extended version along the second direction. Experiments have been performed on a subset of the well-known FERET database, and the results show that both enhanced versions of (PC)2A improve recognition accuracy while using only about half as many eigenfaces as (PC)2A.
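A sketch of the two variants follows, under the same assumptions as above; the n-order projection is taken to be the first-order projection of the element-wise n-th power of the image (our reading of the generalization), and alpha and beta are hypothetical example weights.

    import numpy as np

    def n_order_projection(img, n):
        # n-order projection: the first-order projection map computed on the
        # element-wise n-th power of the image (assumed generalization).
        p = np.asarray(img, dtype=float) ** n
        V = p.sum(axis=1, keepdims=True)
        H = p.sum(axis=0, keepdims=True)
        return (V @ H) / p.sum()

    def epc2a1_images(train_imgs, alpha=0.25, beta=0.1):
        # First direction (E(PC)2A1): blend each original with its first- and
        # second-order projections; alpha, beta are hypothetical example weights.
        return [(np.asarray(img, dtype=float)
                 + alpha * n_order_projection(img, 1)
                 + beta * n_order_projection(img, 2)) / (1.0 + alpha + beta)
                for img in train_imgs]

    def epc2a2_images(train_imgs, n=2):
        # Second direction (E(PC)2A2): keep the M originals and add their
        # 1..n-order projections as derived images, giving (n+1)M images.
        enlarged = [np.asarray(img, dtype=float) for img in train_imgs]
        for img in train_imgs:
            enlarged.extend(n_order_projection(img, k) for k in range(1, n + 1))
        return enlarged

In both cases the eigenface step itself is unchanged: the resulting images are vectorized and fed to standard PCA; only the training set differs.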

The rest of this paper is organized as follows. In Section 2, we present the ways to generalize and enhance (PC)2A. In Section 3, we report our experiments. Finally in Section 4, we conclude.


E(PC)2A1 and E(PC)2A2

In (PC)2A, each original (training) image I(x,y) is linearly combined with its first-order projection into a new version of the original image. It was demonstrated that such a combination helps the subsequent recognition process (Wu and Zhou, 2002). A natural extension of (PC)2A is therefore to exploit more information, such as higher-order projections of the original images, to enhance the recognition process. The projection of an original image is briefly defined as follows
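The snippet truncates the definition here. A plausible reconstruction, based on Wu and Zhou (2002) and the n-order generalization described in the Introduction (the normalization by the total intensity is our assumption), is

V_n(x) = \sum_y I^n(x,y), \qquad H_n(y) = \sum_x I^n(x,y), \qquad P_n(x,y) = \frac{V_n(x)\, H_n(y)}{\sum_x \sum_y I^n(x,y)},

with n = 1 recovering the first-order projection used by (PC)2A.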

Data set

In our experiments, the new methods presented in Section 2 are compared with both (PC)2A and the standard eigenface technique. The experimental configuration is similar to that described in (Wu and Zhou, 2002). The experimental face database comprises 400 gray-level frontal-view face images of 200 persons, each of size 256 × 384. There are 71 females and 129 males; each person has two images (fa and fb) with different facial expressions. The fa images are used as the gallery for training, while the fb images serve as probes for testing.

Conclusions

Most face recognition techniques require at least two training images per person. Recently, a method called (PC)2A was proposed to address the issue of face recognition with one training image per person. In this paper, two directions for generalizing and enhancing (PC)2A are identified and several new algorithms are proposed. These algorithms utilize second-order projections, and even higher-order information, besides the first-order projection used by (PC)2A. Experiments show that the enhanced versions of (PC)2A are about 1.6–3.5% more accurate than (PC)2A while using about 47.5–64.8% fewer eigenfaces.

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant No. 60271017, the National Outstanding Youth Foundation of China under Grant No. 60325237, the Jiangsu Science Foundation under Grant No. BK2002092, and the Qinglan Project Foundation of Jiangsu Province. Portions of the research in this paper use the FERET database of facial images collected under the FERET program.

References

  • R. Chellappa et al., Human and machine recognition of faces: A survey, Proc. IEEE (1995)