
Pattern Recognition

Volume 66, June 2017, Pages 144-152

Pose-invariant face recognition with homography-based normalization

https://doi.org/10.1016/j.patcog.2016.11.024

Highlights

  • We propose a highly efficient and accurate pose normalization approach for pose-invariant face recognition.

  • This is the first time that homography is utilized for face synthesis.

  • The proposed approach covers the full range of pose variations within ±90° of yaw.

  • The proposed approach outperforms existing methods on four popular face databases.

Abstract

Pose-invariant face recognition (PIFR) refers to the ability to recognize face images with arbitrary pose variations. Among existing PIFR algorithms, pose normalization has proved to be an effective approach that preserves texture fidelity, but it usually depends on precise 3D face models or incurs high computational cost. In this paper, we propose a highly efficient PIFR algorithm that effectively handles the main challenges caused by pose variation. First, a dense grid of 3D facial landmarks is projected to each 2D face image, which enables feature extraction in a pose-adaptive manner. Second, for the local patch around each landmark, an optimal warp is estimated based on homography to correct the texture deformation caused by pose variations. The reconstructed frontal-view patches are then utilized for face recognition with traditional face descriptors. The homography-based normalization is highly efficient, and the synthesized frontal face images are of high quality. Finally, we propose an effective approach for occlusion detection, which enables face recognition with visible patches only. The proposed algorithm therefore effectively handles the main challenges in PIFR. Experimental results on four popular face databases demonstrate that the proposed approach performs well in both constrained and unconstrained environments.

Introduction

Face recognition is one of the most important biometric techniques. It has wide potential in many real-world applications, e.g., video surveillance, access control systems, forensics and security, and social networks [1], [2], [3], [4], [5], [6], [7], [8], [9]. The key advantage of face recognition lies in its non-intrusive property, which means it can work in a passive manner. However, the downside of this property is that the appearance of face images is vulnerable to a number of factors, e.g., pose, illumination, occlusion, and expression variations [10]. In particular, pose variation is the primary stumbling block to realizing the full potential of face recognition, as argued in a recent survey [11]. In this paper, we study the pose-invariant face recognition (PIFR) problem, which aims to recognize face images captured under arbitrary poses.

Pose variation dramatically changes the appearance of face images. The appearance difference caused by pose variations usually exceeds the intrinsic appearance difference between subjects. As illustrated in Fig. 1, pose variation results in displacement of facial components, non-linear texture warping, and self-occlusion. Moreover, pose variation often combines with other factors, e.g., image blur and illumination variation, to jointly affect face recognition, as shown in Fig. 2. To handle these challenges, a number of PIFR approaches have been proposed. Among existing approaches, pose normalization is advantageous because it produces pose-free faces with high fidelity and usually requires no training data. Existing pose normalization approaches can be divided into two categories: 2D methods [12], [13], [14] and 3D methods [15], [16], [17], [18]. As the face is essentially a 3D object, the appearance change caused by pose variation can be modeled more accurately with an ideal 3D face model. However, 3D modeling from a single 2D face image is an ill-posed problem and thus difficult in practice. Another disadvantage of 3D methods is that they depend on complicated computer graphics techniques for face image rendering. In comparison, 2D methods conduct pose normalization within the 2D image domain. Because the 2D projection loses one degree of freedom, accurate pose normalization within the 2D image domain is difficult. Existing 2D methods usually adopt computationally expensive algorithms, e.g., Markov Random Fields (MRF) [14] or the Lucas–Kanade algorithm [12], to promote accuracy in pose normalization.

In this paper, we propose a novel pose normalization approach that combines the advantages of both 3D methods and 2D methods. In our approach, a dense grid of 3D facial landmarks is projected onto the 2D image by aligning five semantically corresponding facial landmarks between the face image and a generic 3D face model. The grid of facial landmarks efficiently establishes dense correspondence of face images across pose. Next, by assuming that the local patch around each facial landmark is a simple planar surface, the transformation of the local patch across pose is efficiently approximated by a homography estimated from the landmarks in the patch. With the estimated transformation, the non-linear texture warping across pose is corrected. Compared with existing 2D pose normalization methods, e.g., Markov Random Fields (MRF) [14] and the Lucas–Kanade algorithm [12], the homography-based method estimates the local warp highly efficiently.
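To make the homography step concrete, the sketch below estimates a 3×3 homography from corresponding landmark points with the standard direct linear transform (DLT) and applies it to warp points. This is a minimal numpy illustration of the general technique, not the paper's implementation; the function names are ours.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate a 3x3 homography H with dst ~ H @ src via the DLT.

    src, dst: (N, 2) arrays of corresponding landmark points, N >= 4.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = src.shape[0]
    A = np.zeros((2 * n, 9))
    for i, ((x, y), (u, v)) in enumerate(zip(src, dst)):
        # Each correspondence contributes two linear constraints on H.
        A[2 * i]     = [-x, -y, -1,  0,  0,  0, u * x, u * y, u]
        A[2 * i + 1] = [ 0,  0,  0, -x, -y, -1, v * x, v * y, v]
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def warp_points(H, pts):
    """Apply homography H to (N, 2) points in homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homo = np.hstack([pts, np.ones((pts.shape[0], 1))])
    mapped = homo @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

Because only a small linear system is solved per patch, this is far cheaper than iterative warp estimation such as the Lucas–Kanade algorithm.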

The above method reconstructs frontal-view face image patches from unoccluded facial textures. Existing feature extraction methods, e.g., local descriptors, can then be applied to the corrected face patches to compose the face representation. Occlusion detection is therefore important to distinguish occluded facial textures from visible ones. We further propose a method for occlusion detection and a scheme to extract fixed-length face representations from pose-varied face images. Exploiting face symmetry, we extract patch-level features from both the original face image and its horizontally flipped version. For each patch pair of the two images, the features are fused by weighting according to their visibility. The patch-level feature vectors are then concatenated to compose the complete face representation. In this way, we obtain a fixed-length representation for each face, regardless of pose. The advantage of this scheme is that it makes the best use of visible facial textures for face recognition.
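The symmetry-based fusion can be sketched as follows: each patch's features from the original and mirrored images are averaged with weights proportional to their visibility scores, then concatenated. This is an illustrative sketch under the assumption that visibility is given as a per-patch score in [0, 1]; the weighting details of the paper's scheme may differ.

```python
import numpy as np

def fuse_representation(feats_orig, feats_flip, vis_orig, vis_flip):
    """Fuse per-patch features from an image and its mirror by visibility.

    feats_orig, feats_flip: (P, D) arrays, one D-dim feature per patch.
    vis_orig, vis_flip: length-P visibility scores in [0, 1].
    Returns a fixed-length (P * D,) face representation.
    """
    w_o = np.asarray(vis_orig, dtype=float)
    w_f = np.asarray(vis_flip, dtype=float)
    total = w_o + w_f
    total[total == 0] = 1.0  # avoid division by zero for fully occluded patches
    w_o, w_f = w_o / total, w_f / total
    # Weighted average per patch, then flatten into one fixed-length vector.
    fused = w_o[:, None] * feats_orig + w_f[:, None] * feats_flip
    return fused.reshape(-1)
```

A patch that is occluded in the original image but visible in the mirrored one thus contributes only its visible features, which is what makes the representation robust to self-occlusion.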

In this paper, we term the homography-based pose normalization method HPN. The remainder of the paper is organized as follows: Section 2 briefly reviews related work on PIFR. The proposed HPN method is detailed in Section 3. Face representation based on HPN is described in Section 4. Experimental results are presented in Section 5, leading to conclusions in Section 6.

Section snippets

Related works

A number of approaches have been proposed to solve the PIFR problem from various perspectives. Among existing works, pose-robust feature extraction and pose normalization are the two most important categories of methods. For a comprehensive review of existing methods, we refer readers to a recent survey [11]. In this section, we review only the works most relevant to this paper.

Methods falling in the pose-robust feature extraction category can be further divided into two types: handcrafted

Homography-based pose normalization

In this section, we describe the HPN approach for patch-wise frontal-view synthesis. The main idea is to assume that a local patch on the face is a planar surface; its different views across pose are then related by a homography, whose parameters can be estimated from a set of semantically corresponding facial landmarks. The flowchart of HPN is illustrated in Fig. 3, Fig. 4. First, we align each 2D face image and a generic 3D face model by orthogonal projection and obtain a dense grid of pose
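The alignment step above, fitting an orthogonal (affine) projection of the generic 3D model to the detected 2D landmarks, can be sketched as a linear least-squares fit. This is a hedged illustration of the general technique under a scaled-orthographic camera assumption; the function names are ours, not the paper's.

```python
import numpy as np

def fit_orthographic(model_3d, image_2d):
    """Fit an affine camera (2x3 matrix P, 2-vector t) by least squares,
    so that image_2d ~ model_3d @ P.T + t.

    model_3d: (N, 3) landmark coordinates on the generic 3D face model.
    image_2d: (N, 2) corresponding landmark positions in the face image.
    """
    model_3d = np.asarray(model_3d, dtype=float)
    image_2d = np.asarray(image_2d, dtype=float)
    # Augment with a constant column so the translation is fitted jointly.
    X = np.hstack([model_3d, np.ones((len(model_3d), 1))])   # (N, 4)
    sol, *_ = np.linalg.lstsq(X, image_2d, rcond=None)       # (4, 2)
    P, t = sol[:3].T, sol[3]
    return P, t

def project(P, t, pts_3d):
    """Project 3D model points into the image with the fitted camera."""
    return np.asarray(pts_3d, dtype=float) @ P.T + t
```

Once the camera is fitted from a handful of detected landmarks, the full dense grid of 3D model landmarks can be projected with `project` to establish dense correspondence across pose.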

HPN-based face representation

We extract features from each synthesized frontal-view patch. The type of feature is flexible; in this paper, we mainly employ the Dual-Cross Patterns (DCP) [33] descriptor for feature extraction. In detail, for each pose-normalized patch, we extract a DCP histogram feature from a cell of size N×N pixels (N ≤ M) centered on the central landmark. The DCP feature vectors extracted from all patches are concatenated to form the representation of the face image.
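The extract-per-cell-then-concatenate scheme can be sketched generically. DCP itself is too involved for a short snippet, so the sketch below substitutes a simple gradient-orientation histogram per cell as a stand-in descriptor; the cell size and histogram are illustrative, not the paper's parameters.

```python
import numpy as np

def patch_histogram(image, center, cell_size=16, bins=8):
    """Gradient-orientation histogram over a cell_size x cell_size cell
    centred on a landmark (a simple stand-in for a DCP histogram)."""
    r = cell_size // 2
    y, x = center
    cell = image[y - r:y + r, x - r:x + r].astype(float)
    gy, gx = np.gradient(cell)               # per-pixel image gradients
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    # Magnitude-weighted orientation histogram, L2-normalized.
    hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

def face_representation(image, landmarks, cell_size=16, bins=8):
    """Concatenate per-landmark cell histograms into one fixed-length vector."""
    return np.concatenate([patch_histogram(image, c, cell_size, bins)
                           for c in landmarks])
```

Because every face contributes the same number of landmarks and each cell yields a fixed-size histogram, the concatenated vector has the same length regardless of pose.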

Pose variation results

Experiments

In this section, we conduct extensive experiments to demonstrate the effectiveness of HPN. Two categories of experiments are conducted. First, face identification experiments are conducted on the three most popular databases for PIFR research, i.e., FERET [34], CMU-PIE [35], and Multi-PIE [36]. Images in these three databases were captured in laboratory environments. The gallery set for each database is composed of frontal face images. Probe images are divided into different sets according to

Conclusion

Wide-range pose variation is a major challenge for fully automatic face recognition. Among existing approaches, pose normalization is an effective solution because it preserves high-fidelity facial textures without requiring training data. In this paper, we propose a highly efficient homography-based pose normalization approach named HPN. HPN effectively handles the three major challenges of PIFR, i.e., loss of semantic correspondence, non-linear facial texture warping, and occlusion;

Acknowledgment

This work is supported by Australian Research Council Projects FT-130101457 and DP-140102164.

Changxing Ding received the Ph.D. degree from the University of Technology Sydney, Australia. His research interests include computer vision and machine learning, with a particular focus on face recognition.

References (50)

  • F. Schroff, D. Kalenichenko, J. Philbin, Facenet: a unified embedding for face recognition and clustering, in:...
  • R. He et al., Two-stage nonnegative sparse representation for large-scale face recognition, IEEE Trans. Neural Netw. Learn. Syst. (2013)
  • Y. Sun et al., Complementary cohort strategy for multimodal face pair matching, IEEE Trans. Inf. Forensics Secur. (2016)
  • D.F. Smith et al., Face recognition on consumer devices: reflections on replay attacks, IEEE Trans. Inf. Forensics Secur. (2015)
  • C. Ding et al., A comprehensive survey on pose-invariant face recognition, ACM Trans. Intell. Syst. Technol. (2016)
  • A.B. Ashraf, S. Lucey, T. Chen, Learning patch correspondences for improved viewpoint invariant face recognition, in:...
  • H. Gao, H.K. Ekenel, R. Stiefelhagen, Pose normalization for local appearance-based face recognition, in: Proceedings...
  • H.T. Ho et al., Pose-invariant face recognition using Markov random fields, IEEE Trans. Image Process. (2013)
  • V. Blanz et al., Face recognition based on fitting a 3D morphable model, IEEE Trans. Pattern Anal. Mach. Intell. (2003)
  • M.W. Lee et al., Pose-invariant face recognition using a 3D deformable model, Pattern Recognit. (2003)
  • C. Ding et al., Multi-task pose-invariant face recognition, IEEE Trans. Image Process. (2015)
  • Z. Cao, Q. Yin, X. Tang, J. Sun, Face recognition with learning-based descriptor, in: Proceedings of IEEE Conference on...
  • D. Chen, X. Cao, F. Wen, J. Sun, Blessing of dimensionality: high-dimensional feature and its efficient compression for...
  • D. Yi, Z. Lei, S.Z. Li, Towards pose robust face recognition, in: Proceedings of IEEE Conference on Computer Vision and...
  • S.R. Arashloo et al., Energy normalization for pose-invariant face recognition based on MRF model image matching, IEEE Trans. Pattern Anal. Mach. Intell. (2011)

Dacheng Tao is a Professor of computer science with the Centre for Artificial Intelligence and the Faculty of Engineering and Information Technology at the University of Technology Sydney. He mainly applies statistics and mathematics to data analytics problems, and his research interests span computer vision, data science, image processing, machine learning, and video surveillance. His research results have been expounded in one monograph and 100+ publications at prestigious journals and prominent conferences, such as IEEE TPAMI, T-NNLS, T-IP, JMLR, IJCV, NIPS, ICML, CVPR, ICCV, ECCV, AISTATS, ICDM, and ACM SIGKDD, with several best paper awards, such as the best theory/algorithm paper runner-up award at IEEE ICDM07, the best student paper award at IEEE ICDM13, and the 2014 ICDM 10-Year Highest-Impact Paper Award.
