
Pattern Recognition

Volume 40, Issue 5, May 2007, Pages 1520-1532

Learning the best subset of local features for face recognition

https://doi.org/10.1016/j.patcog.2006.09.009

Abstract

We propose a novel, local feature-based face representation method based on two-stage subset selection, where the first stage finds the informative regions and the second stage finds the discriminative features in those regions. The key motivation is to learn the most discriminative regions of a human face, and the features therein, for person identification, instead of assuming any regions of saliency a priori. We use a subset selection-based formulation and compare three variants of feature selection and genetic algorithms for this purpose. Experiments on frontal face images from the FERET dataset confirm the advantage of the proposed approach in terms of high accuracy and significantly reduced dimensionality.

Introduction

Face recognition has proved to be a difficult problem in computer vision. The main reason for this is that intra-personal variations caused by facial expressions, viewpoint changes, and illumination variations are significant when compared to inter-personal variations. Many researchers have therefore focused on face representation techniques that are invariant to some of these variations [1], [2]. These can be grouped into two categories, holistic and local feature-based: in the first, faces are represented as a whole and statistical techniques are used to extract features from them [3], [4], [5], [6]. The second depends on the localization of salient facial features such as the eyes and the mouth [7], [8], [9]. There are also hybrid approaches which incorporate complementary knowledge from both [10]. Our work in this paper belongs to the second type, with the distinction that the salient local regions are not predicted but are learned from data.

The main idea in a feature-based face representation scheme is the extraction and analysis of local facial features. Salient facial features are first located and then used to code a face. Coding is generally carried out by using the geometric relationships between these points and by extracting local image descriptions around them. Among the different alternatives, 2D Gabor-like filters have been found to be very suitable as local descriptors because of their robustness against translation, rotation, and scaling [7], [11], [12]. 2D Gabor wavelets are selective to different orientations and spatial frequencies. Typically, features extracted by 2D Gabor wavelets have a very large dimensionality. It is therefore essential to analyze the contribution of each feature component to the recognition performance. The important parameters of a 2D Gabor wavelet are: (1) the spatial location of the kernel in the image; (2) the kernel orientation; and (3) the spatial frequency of the kernel.
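The paper's exact Gabor parameterization is not reproduced in this excerpt; as an illustrative sketch (the kernel size, σ, and the frequency/orientation grid below are assumptions, not the authors' values), a bank of real-valued 2D Gabor kernels indexed by frequency and orientation could be built as:

```python
import numpy as np

def gabor_kernel(size, frequency, theta, sigma):
    """Real part of a 2D Gabor kernel: an isotropic Gaussian envelope
    multiplied by a cosine wave of the given spatial frequency,
    propagating along orientation theta (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_theta = x * np.cos(theta) + y * np.sin(theta)  # rotated axis
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * frequency * x_theta)
    return envelope * carrier

# A small bank: 3 spatial frequencies x 4 orientations = 12 kernels.
bank = [gabor_kernel(31, f, t, sigma=6.0)
        for f in (0.1, 0.2, 0.4)
        for t in np.linspace(0.0, np.pi, 4, endpoint=False)]
```

Convolving such a bank at a single image location yields one feature per kernel, which is where the large dimensionality noted above comes from.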

Several studies have concentrated on examining the importance of the Gabor kernel parameters for face analysis. These include: the weighting of Gabor kernel-based features using the simplex algorithm for face recognition [13], the extraction of facial subgraphs for head pose estimation [14], the analysis of Gabor kernels using univariate statistical techniques for discriminative region finding [15], the weighting of elastic graph nodes using quadratic optimization for authentication [8], the use of principal component analysis (PCA) to determine the importance of Gabor features [16], boosting Gabor features [17], and Gabor frequency/orientation selection using genetic algorithms [18].

In almost all previous studies, we see two fundamental assumptions: first, the contribution of each feature dimension is analyzed independently of the others (the independence assumption); and second, Gabor kernel placement over the face region is strongly driven by prior knowledge (the saliency assumption). Placing kernels at visually salient facial points, e.g., the eyes and the mouth, is a frequently used method. The independence assumption does not hold in practice, and one should use more sophisticated methodologies to analyze the relationships between features. Moreover, the effectiveness of the fiducial points should be studied systematically; a better solution is to learn these locations from training data for a given task. In our previous work, we analyzed topographically important facial locations for both pose estimation and identity recognition [19], and used feature selection methods to extract optimal local image descriptor parameters for frontal face recognition [20]. We have also used such features to calculate bottom-up saliency in a selective attention-based face recognizer [21].

In this work, our aim is to relax the independence and saliency assumptions for face recognition by reformulating optimal Gabor basis extraction as a feature subset selection problem. In doing so, we allow our approach to detect more complex relationships and correlations between feature dimensions, and thus to extract a near-optimal Gabor basis. For this purpose, we have devised a two-stage subset selection mechanism: in the first stage, a genetic algorithm finds the most informative facial locations; in the second stage, a floating search method learns the individual parameters, namely frequency and orientation, of the Gabor wavelet-based local descriptors.
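The genetic algorithm of the first stage is not detailed in this excerpt. As a minimal, hypothetical sketch of the idea, candidate facial locations can be encoded as a binary mask and evolved under a task-specific fitness (in the paper's setting, something like validation recognition accuracy); the population size, operators, and toy fitness below are illustrative assumptions:

```python
import random

def evolve_location_mask(n_locations, fitness, pop_size=20,
                         generations=30, p_mut=0.05, seed=0):
    """Toy genetic algorithm over binary masks: bit i marks whether
    candidate facial location i is used.  `fitness` scores a mask."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_locations)]
           for _ in range(pop_size)]
    for _ in range(generations):
        elite = sorted(pop, key=fitness, reverse=True)[:pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)            # pick two parents
            cut = rng.randrange(1, n_locations)    # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per bit.
            child = [bit ^ (rng.random() < p_mut) for bit in child]
            children.append(child)
        pop = elite + children                     # elitist replacement
    return max(pop, key=fitness)

# Toy fitness: agreement with a hidden "informative location" pattern.
target = [1, 0, 1, 1, 0, 0, 1, 0]
best = evolve_location_mask(
    8, lambda m: sum(x == t for x, t in zip(m, target)))
```

Because the top half of each generation survives unchanged, the best fitness in the population never decreases across generations.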

The remainder of this paper is organized as follows: Section 2 describes the proposed approach; experimental results, including a sensitivity analysis, are presented in Section 3. We conclude and discuss future research directions in Section 4.


Proposed approach: learning the best features

We have designed a local feature-based face representation scheme for recognition. Multi-frequency and multi-orientation 2D Gabor wavelets are used as local feature extractors [7], [11]. In order to find an efficient representation, these local image descriptors should be placed carefully over the face region. Moreover, depending on the locations of these image descriptors, useful frequencies and orientations should be found since specific parts of a face contain high frequency information
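The floating search named in the introduction presumably follows the sequential forward floating selection (SFFS) pattern; the sketch below is a generic SFFS under a caller-supplied scoring function, not the authors' implementation:

```python
def sffs(features, score, k):
    """Sequential forward floating selection: greedily add the best
    feature, then conditionally drop features as long as the reduced
    set beats the best subset of that size seen so far."""
    selected = []
    best_of_size = {}                    # subset size -> best score seen
    while len(selected) < k:
        # Forward step: add the single best remaining feature.
        f = max((g for g in features if g not in selected),
                key=lambda g: score(selected + [g]))
        selected = selected + [f]
        s = score(selected)
        if s > best_of_size.get(len(selected), float("-inf")):
            best_of_size[len(selected)] = s
        # Floating (backward) steps: drop the least useful feature
        # only if doing so strictly improves on the best subset of
        # the smaller size; this guard prevents add/remove cycles.
        while len(selected) > 2:
            worst = max(selected,
                        key=lambda g: score([h for h in selected if h != g]))
            reduced = [h for h in selected if h != worst]
            r = score(reduced)
            if r > best_of_size.get(len(reduced), float("-inf")):
                selected, best_of_size[len(reduced)] = reduced, r
            else:
                break
    return selected
```

In the paper's setting, `features` would index Gabor frequency/orientation pairs at the learned locations and `score` would be a recognition criterion; here it is just an abstract callable.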

Experimental results

In our experiments, we have used a subset of the FERET face database [23] which contains subjects having four images. The database contains normalized frontal images of 146 subjects. Each subject has four gray-scale images of resolution 150×130. Faces contain facial expression and illumination variations. Each session contains two training, one validation, and one test image, and therefore there are six possible experimental sessions: {S1, S2, …, S6}. After training with two images per person, the
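The six sessions follow from choosing 2 of each subject's 4 images for training (C(4,2) = 6); assuming a fixed validation/test assignment of the two held-out images (the paper's exact assignment is not given in this excerpt), the splits can be enumerated as:

```python
from itertools import combinations

images = ["I1", "I2", "I3", "I4"]          # four images per subject
sessions = []
for train in combinations(images, 2):      # C(4,2) = 6 training pairs
    held_out = [i for i in images if i not in train]
    # Assumption: the first held-out image is used for validation
    # and the second for test.
    sessions.append({"train": train,
                     "val": held_out[0],
                     "test": held_out[1]})

assert len(sessions) == 6                  # the six sessions S1..S6
```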

Conclusion and discussion

We present a new local feature-based face representation technique which is able to consider the local feature dependencies present in faces and allows better feature extraction. Our main contribution is to reformulate the representation task as a subset selection problem. We have shown that it is possible to reach an accurate and simple facial feature set by learning informative locations from training data instead of assuming regions of saliency a priori, and by taking the dependencies of

Acknowledgments

This work is supported by Boğaziçi University Research Grants 03K120250 and 02A104D. A preliminary version of this work was presented at the IEEE Int. Conference on Image Processing, August 2003, Barcelona [20]. E. Alpaydın is also supported by the Turkish Academy of Sciences, in the framework of the Young Scientist Award Program (EA-TÜBA-GEBİP/2001-1-1).

About the Author—BERK GÖKBERK received his B.S. and M.S. degrees in computer science from Boğaziçi University, Istanbul, Turkey in 1999 and 2001, respectively. He is currently a Ph.D. student at the Department of Computer Engineering, Boğaziçi University. His research interests include biometrics, 2D/3D face recognition, pattern recognition, and computer vision.

References (23)

  • A.M. Martinez et al., PCA versus LDA, IEEE Trans. Pattern Anal. Mach. Intell. (2001)

About the Author—M. OKAN İRFANOĞLU received his B.S. degree in industrial engineering and M.S. degree in computer science, both from Boğaziçi University, Istanbul, Turkey, in 2002 and 2004, respectively. He is currently a Ph.D. student at the Department of Computer Science and Engineering, The Ohio State University.

About the Author—LALE AKARUN received the B.S. and M.S. degrees in electrical engineering from Boğaziçi University, Istanbul, Turkey, in 1984 and 1986, respectively, and the Ph.D. degree from Polytechnic University, Brooklyn, NY, in 1992. From 1993 to 1995, she was Assistant Professor of electrical engineering at Boğaziçi University, where she is now Professor of computer engineering. Her current research interests are in image processing, computer vision, and computer graphics.

About the Author—ETHEM ALPAYDIN received his Ph.D. degree in computer science from Ecole Polytechnique Federale de Lausanne, Switzerland in 1990 and did postdoctoral work at the International Computer Science Institute (ICSI), Berkeley in 1991. Since then he has been teaching in the Department of Computer Engineering at Boğaziçi University, Istanbul, where he is currently professor. He had visiting appointments at MIT in 1994, ICSI in 1997 (as a Fulbright scholar), and IDIAP, Switzerland in 1998. He received the young scientist award from the Turkish Academy of Sciences in 2001 and the scientific encouragement award from the Turkish Scientific and Technical Research Council in 2002. His book, Introduction to Machine Learning, has recently been published by The MIT Press. He is a senior member of the IEEE and the IEEE Computer Society.
