Applied Soft Computing

Volume 53, April 2017, Pages 396-406

Face recognition under pose and illumination variations using the combination of Information set and PLPP features

https://doi.org/10.1016/j.asoc.2017.01.014

Highlights

  • This paper presents an illumination- and pose-invariant face recognition method.

  • The Mamta-Hanman entropy function is made adaptive and its properties are provided.

  • It makes use of the adaptive Mamta-Hanman entropy in the formulation of Hanman transform and Shannon transform.

  • It develops several new features based on the Mamta-Hanman entropy function.

  • It combines other techniques to make the proposed recognition invariant to poses and illumination.

Abstract

This paper presents a new approach to face recognition under pose and illumination variations. The concept of the information set is presented and features based on it are derived using the Mamta-Hanman entropy function. The properties of an adaptive version of this entropy are given, and the nonlinear Shannon transform and the Hanman transform, which are a higher form of information set, are formulated. The information set based features and the nonlinear Shannon transform features are separately combined with Pseudo-inverse Locality Preserving Projections (PLPP) to improve their effectiveness. The performance of the combined features is compared with that of holistic approaches on four face databases (two FERET, one head pose image, and the Extended Yale face database). The features from the combination of the nonlinear Shannon transform and PLPP give consistent performance on the databases tested, whereas the well-known features from the literature perform well on only one or two databases.

Introduction

There are numerous applications centered around face recognition, among them secure access to buildings, airports and ATMs; identity verification for passports, banking, licenses and employee IDs; tagging in images and videos; and surveillance. Moreover, with ever-growing security concerns, the importance of the face biometric has increased many-fold; CCTVs now scan commercial places for criminals in real time. Encouraged by these commercial applications of automatic face recognition (AFR), we look for the most effective techniques.

Searching for a person in images or videos needs a robust face recognition algorithm, as recognition is highly susceptible to variations in illumination and pose. Hence, face recognition under unconstrained conditions and from video footage is an active area of research nowadays. Human beings can recognize faces even after several years, with or without objects on the face, under varied expressions, poses and lighting conditions, by recalling previously stored information about the persons. This capability is very hard to provide to machines. However, advances in machine learning have given an impetus to face recognition under unconstrained conditions.

A lot of work has been done on face recognition in the past few years and many techniques have been developed, as can be seen in the survey papers [28], [29], [30]. Most of them deal with frontal faces under controlled lighting conditions, with or without expressions and objects on the faces [5], [6], [7], [8], [9], [10], [11], [13], [14]. The main obstacles to accurate face recognition are variations in pose, illumination, expression, occlusion and aging. A robust automatic face recognition system that handles pose variation is needed for security and surveillance purposes. Biometrics based on the retina, fingerprints, voice and face are considered more secure than PINs and cards, which can be stolen or misplaced easily. As face capture is non-invasive, acquiring a facial image is not difficult. Moreover, noise does not pose a problem while processing facial images, unlike occlusion and low resolution.

There are three categories of approaches for face recognition: geometry based, holistic and hybrid. The geometry based approaches use expert-engineered geometric features, whereas the holistic approaches use the entire image and/or generic high-level image features as input to the classification algorithms. The third category uses both geometric and holistic features.

The first category includes approaches such as the Elastic Bunch Graph Model (EBGM) [1] and 3D face models. The study of various face recognition algorithms by Du and Ward [3] reveals that the best results on FERET are due to Blanz and Vetter [2]. A statistical morphable model of 3D faces, learned from a set of textured 3D head scans, is used to estimate the 3D shape and texture of a face from a single image. A set of six to eight standard feature points such as the corners of the eyes, the tip of the nose, the corners of the mouth, the ears, and around three points on the contour (cheeks, chin, and forehead) is selected; if any of these points cannot be located in some pose, fewer points are used. Gourier et al. [26] have used two first-order and three second-order Gaussian derivatives at each pixel with respect to its neighborhood. These are normalized by the characteristic scale to derive robust facial features and clustered into regions of an appropriate facial structure, thus making them robust to illumination. The pose is estimated from the relative image positions of salient image structures. As can be noticed, the authors of [26] move from holistic features to geometric features. The limitations of geometry based approaches are that they either depend on the initial facial points selected or need good equipment setups; finding fiducial points automatically is in itself a challenging task. We do not pursue the geometry based category in the present work but instead attempt illumination and pose invariance.

The holistic approaches, being appearance based, require dimensionality reduction. These techniques deal only with the pixel intensities of the face. They eliminate the background by manual means or automatic methods so that the unwanted regions play no role in training the model. The plenoptic function or Eigen Light Field proposed by Gross et al. [4] makes use of the radiance values emitted from an object in all directions, called light fields. It is similar to Eigenface space construction, but rather than using images as the input, the function uses light fields to project the training data. The advantage of this approach is that it can recognize any pose from a few images in the training set.

One of the most common methods, Principal Component Analysis (PCA), also called the Eigenfaces method, is by Turk and Pentland [8], [9]. In this method, high-dimensional face images represented as n-dimensional vectors are reduced to lower d-dimensional (d ≪ n) vectors based on maximum variance. The new subspace onto which the higher-dimensional vectors are projected is known as the face space. Independent Component Analysis (ICA) is another Eigenface approach. ICA finds better basis vectors than those of PCA by considering higher-order relationships among pixels. It minimizes both second-order and higher-order dependencies in the input data; the basis vectors so found make the data statistically independent when projected on them. The two architectures of ICA by Bartlett et al. [10] are: (i) Architecture I, which treats images as random variables and pixels as outcomes and helps find spatially local basis images for the faces; (ii) Architecture II, which treats pixels as random variables and images as outcomes, producing a factorial code representation. Both architectures give better performance than PCA when tested on the FERET dataset.
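The eigenface projection described above can be sketched in a few lines of numpy. This is an illustrative toy under our own choices (the SVD route and variable names are not from [8], [9]): face images are flattened into rows of X, centered, and projected onto the top-d right singular vectors, which are the eigenfaces.

```python
import numpy as np

def eigenface_projection(X, d):
    """Project n-dimensional face vectors onto the top-d principal
    components (the 'face space'). X holds one flattened face per row."""
    mean = X.mean(axis=0)
    Xc = X - mean                       # center the data
    # SVD of the centered data gives the eigenvectors of the covariance
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:d].T                        # top-d eigenfaces, shape (n, d)
    return Xc @ W, W, mean              # d-dimensional projections

# Toy usage: 10 'faces' of dimension 64 reduced to d = 3
rng = np.random.default_rng(0)
faces = rng.random((10, 64))
proj, W, mean = eigenface_projection(faces, 3)
```

A new face is matched by centering it with the stored mean, projecting it with W, and comparing distances in the d-dimensional face space.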

Linear Discriminant Analysis (LDA) is a Fisherface approach [11], [12]. It minimizes the within-class differences and maximizes the between-class differences. In this regard we have two matrices: the between-class scatter matrix SB and the within-class scatter matrix SW. The goal is to minimize SW and maximize SB, i.e. to maximize the ratio |SB|/|SW|. Etemad and Chellappa [11] have compared LDA with PCA. The limitations of these methods are as follows: LDA preserves the global structure but neglects the local structure that is important in recognition tasks, while PCA and ICA fail to encode the discriminant information.
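The two scatter matrices can be computed directly from their definitions; the sketch below is a minimal numpy illustration (toy data, not a full LDA solver, which would also solve the generalized eigen-problem on SW⁻¹SB):

```python
import numpy as np

def scatter_matrices(X, y):
    """Between-class (SB) and within-class (SW) scatter matrices.
    X holds one sample per row; y holds integer class labels."""
    mean = X.mean(axis=0)
    n_feat = X.shape[1]
    SB = np.zeros((n_feat, n_feat))
    SW = np.zeros((n_feat, n_feat))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        d = (mc - mean)[:, None]
        SB += len(Xc) * (d @ d.T)       # class mean vs. global mean
        SW += (Xc - mc).T @ (Xc - mc)   # samples vs. their class mean
    return SB, SW

# Toy usage: two 2-D classes
X = np.array([[1.0, 0.0], [1.2, 0.2], [3.0, 1.0], [3.2, 0.8]])
y = np.array([0, 0, 1, 1])
SB, SW = scatter_matrices(X, y)
```

Maximizing |SB|/|SW| then amounts to choosing projection directions along which the class means spread out while each class stays compact.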

Pang et al. [5] have proposed Neighborhood Preserving Projections (NPP), which, unlike PCA and LDA, utilizes the neighborhood information to learn the global structure. It modifies Locally Linear Embedding (LLE) by introducing a linear transform matrix. Li et al. [6] have used Discriminative Uncorrelated Neighborhood Preserving Projections (DUNPP), which preserves the within-class neighborhood geometry and maximizes the distance between different classes; the extracted features are statistically uncorrelated. Wang et al. [7] have proposed Fisher Locality Preserving Projections (FLPP) by using the maximum scatter difference criterion (MSDC) in the objective function of LPP. It not only preserves the local structure but also makes use of the class information for classification. FLPP is shown to outperform PCA, LDA, MSDC, NPP and LPP in [7].

He et al. [13] have proposed the Laplacianface approach, called Locality Preserving Projections (LPP), which maps face images into a face subspace for analysis. Whereas PCA and LDA see only the Euclidean structure of the face space, LPP preserves the neighborhood structure and obtains a face subspace containing the essential face structure. Being a linear mapping, it shares the properties of nonlinear techniques such as Laplacian Eigenmaps.

Rong et al. [14] have proposed a dimensionality reduction algorithm that improves on LPP, called Pseudo-inverse Locality Preserving Projections (PLPP). It uses the Moore-Penrose pseudo-inverse to invert the singular matrices that arise from the under-sampled problems in the eigen-equation. The simultaneous diagonalization of three matrices reduces the time complexity.
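The role of the Moore-Penrose pseudo-inverse can be seen in a small numpy example; `np.linalg.pinv` stands in here for whatever routine [14] actually uses, and the matrix is a toy, not one arising from a face dataset:

```python
import numpy as np

# A singular (rank-deficient) matrix like those arising in
# under-sampled eigen-problems: the ordinary inverse does not exist,
# but the Moore-Penrose pseudo-inverse always does.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])            # rank 1, det = 0
A_pinv = np.linalg.pinv(A)

# The pseudo-inverse satisfies the Moore-Penrose conditions,
# e.g. A @ A_pinv @ A == A and A_pinv @ A @ A_pinv == A_pinv.
check = np.allclose(A @ A_pinv @ A, A)
```

This is why PLPP can proceed even when the small-sample-size problem makes the scatter matrices singular.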

Georghiades et al. [27] attempt pose- and illumination-invariant face recognition. To this end, they consider a small number of training images under different illumination conditions and reconstruct the shape and albedo of a face, which help determine the pose space. This pose space is sampled such that each pose corresponds to a set of illumination conditions, termed an illumination cone, which is approximated by a low-dimensional space whose basis vectors are estimated using a generative model.

An important class of methods is HMM based; some related papers are surveyed here. In [32] the face images are converted into a sequence of blocks modeled by an HMM, on which DCT is applied followed by PCA for dimensionality reduction. Each class is associated with an HMM, as this method pertains to single-sample face recognition. In [33], 2D DWT is used for feature extraction and a 1D HMM for classification. The face recognition method in [34] uses 2D-DCT coefficients as features, and the HMM takes the hair, forehead, eyes, nose and mouth as its states for modeling. In the proposed approach, we avoid segmenting a face into these regions by partitioning it into windows. A 2D distributed HMM (2D-DHMM) is used in [35] by extending the EM (Expectation-Maximization), GFB (General Forward-Backward) and Viterbi algorithms for image segmentation and classification. In contrast to the HMM based approaches, which require learning of model parameters, our approach computes the needed parameters directly from the windows.

Bae and Kim [15] have used facial (geometric) features and eigen (holistic) features to train a neural network model for real-time face detection and recognition. Karimi and Task [16] have employed global features such as the eyes, mouth, nose, virtual top of the head and chin, and their ratios, which together are termed hybrid facial features, for the determination of age and gender. Jangid et al. [31] have proposed holistic features comprising information set based features derived from non-overlapping sub-images of a face, and geometric features comprising fiducial points and contour features. The hybrid of these two feature types, after feature reduction by 2DPCA, is employed for illumination-invariant face recognition based on a single training image. The hybrid approaches involve multiple stages of processing. In the initial stages, face segmentation is done. Most face segmentation approaches are based either on color information or on the detection of edges. Using color models such as RGB, HSV or YCbCr, it is possible to extract the skin region from images by setting certain color threshold ranges. The limitation of these approaches is that they select any objects that fall within these color ranges; also, owing to the variation in people's skin colors, deciding on the range is difficult. Another way to segment a face is to use edge detection or contouring methods to compute the contrast between the background and the face. This approach requires a lot of rectification while determining and removing the unwanted edges. In this work we are not pursuing face segmentation; rather, a face is partitioned into windows for extracting the information set based features.

The above face recognition methods suffer from computational complexity and low accuracy under unconstrained conditions such as pose and illumination variations. The first motivation is to seek robust features capable of handling the unconstrained conditions; robust features have been used in [26] and [27], but they involve high computational complexity. The second motivation is to reduce the time complexity of the existing face recognition methods. To this end we explore the computationally simpler information set based features, which have been found robust under unconstrained conditions in [19]. In addition, we investigate the effectiveness of a higher form of the information set based features. Next, we combine Pseudo-inverse Locality Preserving Projections (PLPP) with the information set based features, as this combination is shown to reduce the computational time drastically. Note that the Local Principal Independent Components (LPIC) approach presented in [19] is similar to PLPP but has no provision for preserving the neighborhood structure during dimensionality reduction.

The paper is organized as follows: Section 2 gives the formulation of Information set based features. Section 3 presents the adaptive Mamta-Hanman entropy function and its properties. It also provides the derivation of non-linear Shannon transform and Hanman transform for face recognition. Section 4 describes PLPP algorithm for the dimensionality reduction, the illumination invariant approach and the matching algorithm. Section 5 discusses the results of experiments on four databases. The conclusions are given in Section 6.

Section snippets

Derivation of information set features

The concept of the information set was introduced by Hanmandlu [17] based on the information-theoretic entropy function named the Hanman-Anirban entropy function, which owes its origin to Hanmandlu and Anirban Das [18]. This concept has been utilized by Mamta and Hanmandlu in [19] to derive information set based features, such as energy, effective information, and sigmoidal or multi-quadratic features, from non-overlapping windows of an ear image. These features are employed in Local Principal

Adaptive Mamta-Hanman entropy function

Considering the parameters in Eq. (1) as functions rather than constants as per the original definition in [20], the adaptive form of H is written as

$$H = \sum_{i}\sum_{j} I_{ij}^{\alpha}\, e^{-\left(c_{ij} I_{ij} + d_{ij}\right)^{\beta}}$$

Now the parameters $c_{ij}$ and $d_{ij}$ are variables. The 1D form of this function [20] is:

$$H = \sum_{i=1}^{n} p_i^{\alpha}\, I(p_i)$$

where $I(p_i) = e^{-\left(c_i p_i^{\alpha} + d_i\right)^{\beta}}$ with $c_i, d_i \in [0, 1]$. We will prove some important properties of the adaptive entropy function. In order to simplify the proofs we set α = 1.
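As a hedged sketch, the adaptive entropy of a window can be evaluated directly from the 2D form with per-pixel parameters; the constant values of c and d below are arbitrary illustrations, not values from the paper:

```python
import numpy as np

def adaptive_mamta_hanman(I, c, d, alpha=1.0, beta=1.0):
    """Adaptive Mamta-Hanman entropy of a window of normalized
    intensities I, with per-pixel parameter arrays c and d in [0, 1]:
    H = sum over pixels of I^alpha * exp(-(c*I + d)^beta)."""
    gain = I ** alpha                      # information source values
    info = np.exp(-(c * I + d) ** beta)    # exponential gain function
    return np.sum(gain * info)

# Toy 2x2 window of normalized intensities, constant parameters
I = np.array([[0.2, 0.5],
              [0.7, 0.9]])
c = np.full_like(I, 0.5)
d = np.full_like(I, 0.1)
H = adaptive_mamta_hanman(I, c, d)
```

With alpha = 1 (as in the proofs that follow) each pixel contributes its intensity weighted by an exponential information gain, so H stays bounded by the plain intensity sum.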

PLPP algorithm

The n feature vectors, each of dimension m, are arranged in a matrix X ∈ R^(m × n). The steps involved in PLPP, shown in Fig. 1(a), are briefly described now:

Step 1: Construct the weighted k-nearest-neighbor graph WG = (X, S), where each data sample xi is a node and S is an affinity matrix that represents the local structure. An element of S is computed from

$$S_{ij} = e^{-\|x_i - x_j\|^2 / t}$$

where t is a threshold, if xj is among the k nearest neighbors of xi; otherwise Sij is 0. Instead of Sij,
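Step 1 above can be sketched as follows; the brute-force distance computation, the self-exclusion, and the explicit symmetrization are our own simplifications for illustration, not the authors' implementation:

```python
import numpy as np

def affinity_matrix(X, k=2, t=1.0):
    """Heat-kernel affinity S over the k-nearest-neighbor graph.
    X holds one sample per column, as in the m x n data matrix."""
    n = X.shape[1]
    # Pairwise squared Euclidean distances (brute force, for clarity)
    D2 = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            D2[i, j] = np.sum((X[:, i] - X[:, j]) ** 2)
    S = np.zeros((n, n))
    for i in range(n):
        # k nearest neighbors of x_i, excluding x_i itself
        nbrs = np.argsort(D2[i])[1:k + 1]
        for j in nbrs:
            S[i, j] = np.exp(-D2[i, j] / t)
            S[j, i] = S[i, j]           # keep the graph symmetric
    return S

# Toy usage: three 2-D samples as columns
X = np.array([[0.0, 1.0, 5.0],
              [0.0, 0.0, 0.0]])
S = affinity_matrix(X, k=1, t=1.0)
```

Entries of S are large for nearby neighbors and exactly zero for non-neighbors, which is what lets the later PLPP steps preserve only the local structure.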

Results of experimentation

The experiments are carried out on an Intel Core i5 processor at 2.67 GHz with 3 GB of RAM. We have used MATLAB R2010a as our coding platform and the PhD toolbox [24], [25] for LDA. The matching algorithm is tested on three datasets. In order to ascertain the performance of the proposed face recognition method in terms of accuracy and processing time, k-fold cross validation is done. A window size of 13 × 13 is selected by experimentation. The recognition accuracy is counted based on the correct

Conclusions

It is imperative to have a fast and robust method for face recognition as in surveillance applications. In this paper, the information-set based features are formulated to overcome the drawbacks of the prevailing holistic methods. The face recognition is attempted under pose, illumination and resolution variations by roping in Weber-LDP images for illumination invariance, PLPP for pose invariance in conjunction with the information set based features,

Combining PLPP with either the information

References (35)

  • Wei Li et al.

    Discriminative uncorrelated neighborhood preserving projection for facial expression recognition

    IEEE 11th International Conference on Signal Processing (ICSP), 21–25 Oct.

    (2012)
  • Guoqiang Wang et al.

    Fisher locality preserving projections for face recognition

    J. Comput. Inf. Syst.

    (2013)
  • M. Turk et al.

    Eigenfaces for recognition

    J. Cogn. Neurosci.

    (1991)
  • M.A. Turk et al.

    Face recognition using eigenfaces

  • M.S. Bartlett et al.

    Face recognition by independent component analysis

    IEEE Trans. Neural Netw.

    (2002)
  • K. Etemad et al.

    Discriminant analysis for recognition of human face images

    J. Opt. Soc. Am. A

    (1997)
  • P.N. Belhumeur et al.

    Eigenfaces vs. Fisherfaces: recognition using class specific linear projection
