Elsevier

Pattern Recognition

Volume 38, Issue 2, February 2005, Pages 209-219

Face recognition using direct, weighted linear discriminant analysis and modular subspaces

https://doi.org/10.1016/j.patcog.2004.07.001

Abstract

We present a modular linear discriminant analysis (LDA) approach for face recognition. A set of observers is trained independently on different regions of frontal faces, and each observer projects face images to a lower-dimensional subspace. These lower-dimensional subspaces are computed using LDA methods, including a new algorithm that we refer to as direct, weighted LDA or DW-LDA. DW-LDA combines the advantages of two recent LDA enhancements, namely direct LDA (D-LDA) and weighted pairwise Fisher criteria. Each observer performs recognition independently and the results are combined using a simple sum-rule. Experiments compare the proposed approach to other face recognition methods that employ linear dimensionality reduction. These experiments demonstrate that the modular LDA method performs significantly better than other linear subspace methods. The results also show that D-LDA does not necessarily perform better than the well-known approach of principal component analysis followed by LDA. This is an important counterpoint to previously published experiments that used smaller databases. Our experiments also indicate that the new DW-LDA algorithm is an improvement over D-LDA.

Introduction

Despite the availability of commercial systems, face recognition continues to be an active topic in computer vision research. Current face recognition systems perform well under nearly ideal circumstances, but tend to suffer when variations in expression, illumination, decoration (e.g., glasses, facial hair), and/or pose are present. Most current face recognition research aims to improve recognition performance in the presence of such confounding factors. Face recognition methods can be classified broadly into two categories, feature-based or template-based, as described in Ref. [1]. The research presented in this paper is a template-based approach where the image pixels themselves serve as the features. Below we survey the research that has motivated our work. We note that there are other confounding factors (aging and/or drastic weight change, for example) that are beyond the scope of our interest. Furthermore, there are many other interesting and effective face recognition approaches that are not related to this work. A few examples include elastic bunch graph matching [2], support vector classification [3], morphable models [4], and light-fields [5]; certainly, the interested reader can find many more.

Illumination: Approaches for dealing with varying illumination are primarily based upon linear discriminant analysis (LDA), sometimes referred to as “Fisherfaces” [6], [7], [8]. A motivating principle behind these techniques is the approximation of a face as a Lambertian surface. As noted in Ref. [6], the images of a Lambertian surface under varying illumination lie in a linear subspace of the entire image space and, under ideal conditions, are linearly separable.
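The linear-subspace property noted above can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes an idealized Lambertian model with no attached or cast shadows (intensities are not clamped at zero), and the surface normals, albedos, and light sources are random, hypothetical data. Under those assumptions, every image is a linear function of a three-dimensional light vector, so a stack of images has rank three regardless of how many lighting conditions are used.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_lights = 500, 20          # hypothetical toy sizes

# Hypothetical surface: one unit normal and one albedo value per pixel.
normals = rng.normal(size=(n_pixels, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
albedo = rng.uniform(0.2, 1.0, size=(n_pixels, 1))

# Each column is a light direction scaled by its intensity.
lights = rng.normal(size=(3, n_lights))

# Idealized Lambertian model WITHOUT shadows (assumption):
# intensity = albedo * (normal . light), so negative values are not clamped.
images = (albedo * normals) @ lights  # one image per column

rank = np.linalg.matrix_rank(images)
print(rank)
```

Clamping negative intensities to zero, as happens with attached shadows in real images, breaks the exact rank-three structure, which is one reason the separability claim holds only under ideal conditions.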

Expression: Varying facial expression can be modeled to some degree by the active appearance models (AAMs) presented in Ref. [9]. AAMs characterize shape and texture information using a statistical point distribution approach.

Illumination and expression: Bayesian face recognition [10], [11], [12] has been proposed to improve robustness in the presence of varying illumination and expression. These approaches employ probabilistic models to characterize intra-personal and inter-personal differences with a principal component analysis (PCA) or “eigenface” representation. In Ref. [10], it is noted that the Bayesian approach can be thought of as a general, non-linear extension of LDA. With this in mind, it seems a reasonable hypothesis that LDA should also be able to address both illumination and expression to some degree. Recent research [13] has demonstrated this hypothesis to be true.

Facial decoration: There has been very little work towards explicitly handling facial decoration. In Refs. [12] and [14], it was shown that two “eigenfeature” images—the eyes and the nose—could be used for accurate recognition after a change in facial hair. However, no method for online selection of the appropriate eigenfeatures was suggested. In the LDA approach described in Ref. [7], some promising results were obtained after artificially degrading face images, indicating that LDA might also provide a reasonable solution to handling some degree of decoration, assuming that the registration landmarks can still be located (i.e., no occlusions of landmarks such as dark glasses hiding the eyes or scarves covering the mouth).

Pose: One method to handle varying pose is the view-based eigenspace approach [14], which was recently shown to perform quite well [15]. Each pose is represented by its own subspace and the multiple subspaces act as independently trained “experts” or observers trying to explain the data. Similarly motivated techniques include characteristic eigenspace curves [16] and view-based AAMs [17].

Pose and expression: AAM methods [17], [18] have been proposed to handle both varying pose and expression.

Pose and illumination: Methods were presented in Refs. [19] and [20] to deal with varying pose and illumination. These methods rely upon generative models that can synthesize a given face under varying illumination from different viewpoints. Although the performance in Ref. [19] is quite remarkable, the proposed method employs seven training images for each subject under strictly controlled lighting and does not address expression or decoration.

The research presented in this paper is motivated by the goal of personnel monitoring in critical spaces of secure facilities. In these situations, we will need to recognize between 100 and 150 people and will have access to good training data. Variations in illumination, expression, and decoration (particularly eyeglasses) are expected. Since access to the spaces in question is generally well-controlled and monitored by video cameras, the acquisition of frontal images is relatively easy compared to less controlled situations; hence, pose variation issues are minimal. With all of these facts in mind, we now note the specific contributions of this paper.

  • We propose a modular LDA face recognition algorithm, which is an improvement over the modular PCA approach. Through careful analysis of previous research, our approach explicitly aims to address three of the four confounding factors, namely illumination, expression, and decoration. None of the algorithms presented above addresses more than two confounding factors. Assuming an accurate pose estimator (a subject of ongoing research) and adequate training data, we believe the extension of the proposed system to variable pose is straightforward, as discussed briefly in Section 3.4.

  • We propose a new LDA algorithm called direct, weighted LDA (DW-LDA) that simultaneously provides the advantages of both direct LDA [21] and weighted pairwise Fisher criteria [22]. A point of significant interest is that we find experimentally that the direct LDA methods do not perform as well in terms of classification accuracy as PCA plus LDA methods. This is in contrast with earlier results in the literature [21]. The direct LDA methods do, however, provide the means to perform subspace computation when there is abundant training data (i.e., no small sample size problem) and many subjects, where PCA might be computationally intractable due to the dimensionality of the full rank covariance matrix. For example, if we had 1000 subjects and 10,000 training images of 10,000 pixels each, PCA would require the eigen-decomposition of a 10,000×10,000 matrix. D-LDA and DW-LDA, however, would only require the eigen-decomposition of a 1000×1000 matrix.

  • In Section 2.3, we note what seems to be a contradiction between direct LDA and the weighted, pairwise Fisher criteria regarding the importance of the nullspace of the within-class scatter matrix. (Understanding this contradiction is the subject of ongoing research.)

  • We provide experimental results comparing several subspace approaches, both with and without classifier combination. These experiments are the first for modular LDA and, for modular PCA, provide results on a larger database than has been previously published. These results demonstrate the significant performance improvements achievable using simple classifier combination with modular LDA (or modular PCA) subspaces. Perhaps most importantly, these results are the first to indicate that, although computational benefits are indeed provided, direct LDA methods do not necessarily perform better than PCA-first methods in terms of classification accuracy.

  • We describe the computation and use of a simple confidence metric. We show experimentally how this confidence metric can be employed to significantly improve accuracy in situations where multiple observations of a given subject are expected.
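The matrix-size argument in the second contribution above (a 1000×1000 rather than a 10,000×10,000 eigen-decomposition) can be illustrated with a short sketch. This is not the authors' implementation; it assumes the standard small-matrix trick, in which the between-class scatter is written as S_b = Φ_b Φ_bᵀ with one column of Φ_b per class, equal class priors, and toy sizes. The names `class_means` and `phi_b` are hypothetical, and the sketch covers only the between-class eigen-decomposition, not the full D-LDA or DW-LDA procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_classes = 400, 5          # toy stand-ins for 10,000 pixels and 1000 subjects

# Hypothetical class-mean images, one column per class.
class_means = rng.normal(size=(n_pixels, n_classes))
global_mean = class_means.mean(axis=1, keepdims=True)

# One column per class (equal priors assumed): S_b = phi_b @ phi_b.T
phi_b = class_means - global_mean

# Decompose the small n_classes x n_classes matrix instead of S_b itself.
small = phi_b.T @ phi_b
eigvals, v = np.linalg.eigh(small)

# Centering leaves at most n_classes - 1 nonzero eigenvalues; drop the rest.
keep = eigvals > 1e-8 * eigvals.max()
eigvals, v = eigvals[keep], v[:, keep]

# Map back to pixel space: u = phi_b v / sqrt(lambda) are unit eigenvectors of S_b.
u = phi_b @ v / np.sqrt(eigvals)

# Verify against the large matrix (feasible here only because the sizes are tiny).
s_b = phi_b @ phi_b.T
residual = np.linalg.norm(s_b @ u - u * eigvals)
print(small.shape, u.shape, residual)
```

The large matrix `s_b` is formed here only to verify the result; in the setting described above it would never need to be built explicitly.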

The remainder of this paper is organized as follows. In Section 2, we first review traditional LDA (which we will refer to as T-LDA), direct LDA (D-LDA), and weighted LDA (W-LDA). We then present an algorithm that combines both direct and weighted LDA in a unified algorithm we refer to as DW-LDA. In Section 3, we present the multiple-observer, modular LDA subspace system. We then provide some experimental results in Section 4 and conclude in Section 5 with some closing remarks.

Section snippets

DW-LDA

The aim of traditional LDA (T-LDA) is to project high-dimensional feature vectors in $\mathbb{R}^n$ onto a lower-dimensional subspace $\mathbb{R}^m$, where $m<n$, while preserving as much discriminative information as possible. One formal expression for the corresponding optimization criterion (see Ref. [23] for equivalents) can be written

$$\arg\max_{A} \frac{\operatorname{tr}(A^T S_b A)}{\operatorname{tr}(A^T S_w A)},$$

where $A \in \mathbb{R}^{n \times m}$ is the projection matrix we seek, $\operatorname{tr}(\cdot)$ is the trace operator, $S_w \in \mathbb{R}^{n \times n}$ is the within-class scatter matrix, and $S_b \in \mathbb{R}^{n \times n}$ is the between-class
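As a rough illustration of the criterion in this snippet, the sketch below computes the two scatter matrices from labeled data and evaluates the trace ratio. It is a minimal sketch under stated assumptions, not the paper's algorithm: it uses the textbook scatter definitions, assumes the within-class scatter is positive definite (no small sample size problem), and solves the usual generalized eigenproblem. For m = 1 the leading generalized eigenvector maximizes the ratio exactly; for m > 1 the eigenvector solution is a commonly used surrogate. All function names and the toy data are hypothetical.

```python
import numpy as np

def scatter_matrices(x, labels):
    """Return (S_w, S_b) for samples x of shape (n_samples, n_features)."""
    mean_all = x.mean(axis=0)
    n = x.shape[1]
    s_w, s_b = np.zeros((n, n)), np.zeros((n, n))
    for c in np.unique(labels):
        xc = x[labels == c]
        mean_c = xc.mean(axis=0)
        centered = xc - mean_c
        s_w += centered.T @ centered                 # spread around the class mean
        diff = (mean_c - mean_all)[:, None]
        s_b += len(xc) * (diff @ diff.T)             # spread of class means
    return s_w, s_b

def lda_projection(s_w, s_b, m):
    """Top-m generalized eigenvectors of (S_b, S_w); assumes S_w is positive definite."""
    chol = np.linalg.cholesky(s_w)                   # S_w = L L^T
    half = np.linalg.solve(chol, s_b)                # L^{-1} S_b
    sym = np.linalg.solve(chol, half.T)              # L^{-1} S_b L^{-T}
    sym = 0.5 * (sym + sym.T)                        # remove round-off asymmetry
    eigvals, vecs = np.linalg.eigh(sym)
    top = np.argsort(eigvals)[::-1][:m]
    return np.linalg.solve(chol.T, vecs[:, top])     # back to the original coordinates

def trace_ratio(a, s_w, s_b):
    return np.trace(a.T @ s_b @ a) / np.trace(a.T @ s_w @ a)

# Toy data: three classes in R^4, separated along the first axis.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 30)
x = rng.normal(size=(90, 4))
x[:, 0] += 5.0 * labels

s_w, s_b = scatter_matrices(x, labels)
a = lda_projection(s_w, s_b, m=1)
random_a = rng.normal(size=(4, 1))
print(trace_ratio(a, s_w, s_b), trace_ratio(random_a, s_w, s_b))
```

With face images, the number of pixels typically exceeds the number of training samples, so the within-class scatter is singular and the Cholesky step fails. That is the small sample size problem that motivates applying PCA first or using direct LDA.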

Modular Fisherfaces

Perhaps the earliest suggestions for the use of modular subspaces can be found in Ref. [1] and also in the view-based and modular eigenspaces of Ref. [14]. It is noted in Ref. [14], and observed in much research since, that recognition from a frontal face image is sensitive to changes in expression, decoration, and illumination. By decomposing the full face image into modular subregions, Ref. [14] shows that improved accuracy can be obtained with respect to expression and decoration variation.
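The sum-rule combination of independent observers mentioned in the abstract can be sketched as follows. This is a hypothetical illustration, not the paper's scoring scheme: it assumes each observer outputs one score per enrolled subject, with larger values meaning a better match and scores on a comparable scale. The region labels and numbers are invented for the example.

```python
import numpy as np

def sum_rule(observer_scores):
    """Combine per-observer scores of shape (n_observers, n_subjects) by summation."""
    totals = np.sum(observer_scores, axis=0)
    return int(np.argmax(totals)), totals

# Hypothetical scores from three region observers for three enrolled subjects.
scores = np.array([
    [0.40, 0.35, 0.25],   # observer A: weak preference for subject 0
    [0.45, 0.40, 0.15],   # observer B: weak preference for subject 0
    [0.05, 0.90, 0.05],   # observer C: strong preference for subject 1
])

label, totals = sum_rule(scores)
votes = np.bincount(np.argmax(scores, axis=1), minlength=scores.shape[1])
print(label, totals, votes)
```

In this example a majority vote would pick subject 0, while the sum rule picks subject 1 because it takes into account how strongly each observer prefers its choice.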

Experimental results

In this section, we present experimental results using our modular system and compare the performance of several subspace projection algorithms. Recall from Section 1 that our target application requires recognition of between 100 and 150 people, with good training data available for each person. With this in mind, we selected two publicly available databases that contained data most appropriate to our problem of interest. The first database was the CVL database [30], which nominally

Conclusions

In this paper, we present a modular LDA subspace approach for template-based face recognition that performs significantly better than traditional “eigenfaces” or “Fisherfaces.” This approach is specifically aimed at addressing three (illumination, expression, and decoration) of the four confounding factors of interest. Although pose is not specifically addressed in the presented work, we briefly describe how the system might be extended to handle pose variation. We also present a new LDA-based

Acknowledgements

This research was sponsored by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the US Department of Energy under Contract No. DE-AC05-00OR22725.

About the Author—JEFF PRICE received the B.S.E.E. degree from the US Naval Academy in 1993, and the M.S. and Ph.D. degrees in Electrical Engineering from the Georgia Institute of Technology in 1997 and 1999, respectively. Dr. Price is currently with the Image Science and Machine Vision Group at Oak Ridge National Laboratory.

References (31)

  • W. Zhao, R. Chellappa, P.J. Phillips, Subspace linear discriminant analysis for face recognition, Technical Report...
  • G.J. Edwards, T.F. Cootes, C.J. Taylor, Face recognition using active appearance models, in: Proceedings of the...
  • B. Moghaddam et al., Bayesian face recognition using deformable intensity surfaces
  • B. Moghaddam et al., Probabilistic visual learning for object representation, IEEE Trans. Pattern Anal. Mach. Intell. (1997)
  • C. Liu et al., Gabor feature based classification using the enhanced Fisher linear discriminant model for face recognition, IEEE Trans. Pattern Anal. Mach. Intell. (2002)

About the Author—TIM GEE received the B.S.E.E. degree from Auburn University in 1992, and the M.S. degree in Electrical Engineering from the Georgia Institute of Technology in 1993. Mr. Gee is currently a member of the Image Science and Machine Vision Group at Oak Ridge National Laboratory.
