Elsevier

Pattern Recognition

Volume 52, April 2016, Pages 218-237
A Two-Phase Weighted Collaborative Representation for 3D partial face recognition with single sample

https://doi.org/10.1016/j.patcog.2015.09.035

Highlights

  • Novel Keypoint-based Multiple Triangle Statistics (KMTS) are proposed for 3D face representation.

  • The proposed local descriptor is robust to partial facial data and expression/pose variations.

  • A Two-Phase Weighted Collaborative Representation Classification (TPWCRC) framework is used to perform face recognition.

  • The proposed classification framework can effectively address the single sample problem.

  • State-of-the-art performance on six challenging datasets with high efficiency is achieved.

Abstract

3D face recognition with the availability of only partial data (missing parts, occlusions and data corruptions) and single training sample is a highly challenging task. This paper presents an efficient 3D face recognition approach to address this challenge. We represent a facial scan with a set of local Keypoint-based Multiple Triangle Statistics (KMTS), which is robust to partial facial data, large facial expressions and pose variations. To address the single sample problem, we then propose a Two-Phase Weighted Collaborative Representation Classification (TPWCRC) framework. A class-based probability estimation is first calculated based on the extracted local descriptors as a prior knowledge. The resulting class-based probability estimation is then incorporated into the proposed classification framework as a locality constraint to further enhance its discriminating power. Experimental results on six challenging 3D facial datasets show that the proposed KMTS–TPWCRC framework achieves promising results for human face recognition with missing parts, occlusions, data corruptions, expressions and pose variations.

Introduction

Face recognition (FR) is an active research area in the computer vision community due to its non-intrusive and friendly acquisition nature when compared to other biometrics. The task of FR is to identify or verify a human face from its records (2D and 3D modalities). FR has a number of real-world applications including access control and video surveillance. Among various facial modalities, considerable progress has been made with 2D FR. However, its performance is still challenged by pose changes and illumination variations. With the rapid development of 3D acquisition technologies [1], [2], [3], [4], [5], [6], [7], 3D FR has drawn growing attention due to its potential capability to overcome the inherent limitations of its 2D counterpart. Most of the existing 3D FR approaches are proposed for facial scans acquired in highly controlled environments. Such facial scans are mostly frontal and of good quality. Promising recognition performance has been reported on such high-quality 3D facial data by existing 3D FR approaches. However, for many real-world applications, facial scans can only be acquired in uncontrolled environments. Such facial scans usually contain a partial face in the presence of missing parts, occlusions and data corruptions (as shown in Fig. 1). FR from such partial facial scans is still an open issue, and further investigation is required to enable fully automatic 3D Partial Face Recognition (PFR). Although a few 3D PFR approaches have been proposed, the recent release of datasets containing partial facial scans (e.g., the Bosphorus [8], GavabDB [9] and UMB-DB [10] datasets) provides a large benchmark for 3D PFR and has further boosted the research on this topic. Furthermore, most of the existing 2D/3D FR approaches require a sufficiently large set of training samples per individual to cover possible facial variations (e.g., partial data and expressions) for accurate FR.
However, in many real-world FR applications, only a single sample per individual can be provided for training, resulting in the single sample based FR problem. Partial facial data and the limitation in the number of training samples per individual are therefore two important challenges for real-world 3D PFR applications.

The existing 3D FR approaches can be coarsely classified into two categories: global descriptor based and local descriptor based approaches. The global descriptor based approaches extract features from the entire face to encode the geometric characteristics of a 3D face [5], [7]. However, these approaches only work when the complete 3D face is available. They therefore rely on the availability of a complete facial scan and are sensitive to missing parts, occlusions and data corruptions. In contrast, local descriptor based approaches usually extract features from the neighborhoods of detected keypoints, resulting in a set of coordinate-independent local descriptors. Compared to global descriptor based approaches, the number of extracted local descriptors adapts to the content of the input face (holistic or partial). Such a flexible facial representation is therefore more suitable for 3D PFR. Furthermore, it is commonly believed that only a few facial regions are significantly affected by distortions caused by missing parts, occlusions and data corruptions, while most of the other regions remain invariant [11], [12]. This indicates that local facial descriptors are more suitable and powerful when dealing with partial facial data.

Recently, Sparse Representation (SR) has become a powerful method for various pattern recognition tasks. Experimental results on FR show that Sparse Representation-based Classification (SRC) algorithms outperform a number of conventional FR algorithms. However, solving the l1-norm based sparsity problem usually requires computationally demanding optimization procedures. It has been argued that l1-norm sparsity is not essential for the improvement of FR performance [13]. Several more efficient algorithms have therefore been proposed, including Collaborative Representation-based Classification (CRC), which achieves recognition performance similar to SRC [13], [14]. However, both SRC and CRC require multiple training samples for each individual to cope with test inputs exhibiting complex variations. In the case of the single sample problem, the representation becomes unreliable due to the lack of training samples. Inspired by this fact, more prior classification knowledge can be incorporated into the data representation to perform single sample based FR.
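The appeal of CRC noted above is that the l2-regularized coding has a closed form, so no iterative l1 solver is needed. A minimal numerical sketch (the variable names and toy setup are illustrative assumptions, not the paper's pipeline): code the probe over the whole training dictionary, then assign the class whose atoms yield the smallest regularized residual, as in CRC-RLS [13].

```python
import numpy as np

def crc_classify(D, labels, y, lam=1e-3):
    # Code the probe y over the whole dictionary D (columns = training
    # samples) with an l2 penalty; the ridge solution is closed-form:
    # x = (D^T D + lam*I)^(-1) D^T y
    x = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ y)
    classes = np.unique(labels)
    residuals = []
    for c in classes:
        mask = labels == c
        # Regularized class-wise residual, as used by CRC-RLS [13].
        r = np.linalg.norm(y - D[:, mask] @ x[mask]) / (np.linalg.norm(x[mask]) + 1e-12)
        residuals.append(r)
    return classes[int(np.argmin(residuals))]
```

For example, a probe identical to one of the class-0 atoms is reconstructed almost entirely by class-0 coefficients and is assigned to class 0.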

In this paper, we propose a Two-Phase Weighted Collaborative Representation Classification (TPWCRC) for single sample based 3D PFR. First, an efficient Keypoint-based Multiple Triangle Statistics (KMTS) descriptor is presented for facial scan representation (Section 3). The proposed KMTS descriptor exhibits both high discriminative power and robustness under various deformations (i.e., partial data, expressions and pose variations). Second, a class-based probability estimation algorithm is presented to provide strong prior classification knowledge for any test input during the first phase of classification. The probability estimation can then be used to compensate for the unavailability of multiple training samples per individual. The resulting probability estimation is therefore integrated into the TPWCRC framework as a locality constraint to constrain the global nature of the data representation of the test input. The final classification result then corresponds to the individual who gives the smallest residual (Section 4). The proposed KMTS–TPWCRC 3D PFR approach is alignment-free and requires no manual intervention. Its efficiency and robustness have been extensively demonstrated by a set of experiments on six popular 3D face datasets (Section 5).
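The locality-constraint idea can be sketched as a weighted ridge coding in which atoms of classes that the first-phase estimate deems unlikely receive a larger coding penalty, so the second-phase representation concentrates on likely classes. The weighting scheme and names below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def weighted_crc(D, y, labels, class_prob, lam=1e-3):
    # class_prob maps each class label to its first-phase probability
    # estimate; atoms of unlikely classes get a larger coding penalty
    # (hypothetical inverse-probability weighting, for illustration only).
    w = np.array([1.0 / (class_prob[c] + 1e-6) for c in labels])
    # Closed form of min_x ||Dx - y||^2 + lam * ||diag(w) x||^2.
    return np.linalg.solve(D.T @ D + lam * np.diag(w ** 2), D.T @ y)
```

With two identical atoms labelled 'a' and 'b' and a prior strongly favouring 'a', the coding mass shifts to the 'a' atom; classification would then proceed by class-wise residuals as in ordinary CRC.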

Furthermore, the proposed approach only involves simple computations, including nosetip detection, keypoint detection, a local geometrical descriptor and a TPWCRC framework which can be derived analytically. To develop an efficient 3D PFR system, simple geometric features are extracted without any complex mathematical operations. Unlike many existing 3D PFR approaches which rely on complicated and time-consuming feature extraction methods (e.g., [15], [16]) or occlusion data restoration methods (e.g., [15], [17]), the proposed approach is very efficient (see Section 5.5). Its efficiency can further be improved by performing these simple computations in parallel. Although some existing methods achieve better FR performance, the proposed approach achieves comparable performance at much lower computational cost. This suggests that our proposed approach is more suitable for practical applications.

The rest of this paper is organized as follows. Section 2 provides a brief literature review of closely related 3D FR approaches. Section 3 describes the proposed KMTS facial descriptors. Section 4 introduces the proposed TPWCRC framework for single sample based 3D PFR. Section 5 presents the experimental results and comparative analysis on six popular 3D facial datasets. Section 6 concludes the paper.

Section snippets

Related work and our contributions

FR has been well studied and several survey papers can be found in [18], [19]. In this section, we first briefly summarize the existing literature that is most relevant to our approach. Specifically, we restrict our review to local descriptor based 3D FR approaches (Section 2.1), 3D PFR approaches (Section 2.2), and SRC based facial analysis approaches (Section 2.3). We then give an overview of the proposed approach in Section 2.4.

Motivation of the proposal of KMTS

Most of the existing 3D FR approaches rely on holistic facial descriptors with a predefined dimensionality. However, in the case of 3D PFR, a facial scan may suffer from missing parts, occlusions and data corruptions and it is not always feasible to extract fixed-length holistic feature descriptors. Therefore, in this work, we propose a local descriptor for 3D PFR based on our previous work [11], [20]. Our descriptor is specifically tailored for 3D PFR with the following considerations. First,

Motivation of TPWCRC

SR has become a powerful technique to address many pattern recognition and computer vision problems [36]. SR assumes that each data point $y \in \mathbb{R}^m$ can be encoded as a sparse linear combination of other points from a dictionary. That is, $y = Dx$, where $D$ is a dictionary of training samples, and $x$ is the representation coding of $y$ over $D$. It is required that most entries in $x$ are zeros. This can be calculated by solving the following $\ell_1$-minimization problem: $\hat{x} = \arg\min_x \|Dx - y\|_2^2 + \lambda \|x\|_1$, where $\lambda \geq 0$ is a scalar
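As a concrete illustration, the l1-minimization problem above can be solved with a simple iterative shrinkage-thresholding (ISTA) loop. This is a generic sketch under assumed parameters, not the solver used in the paper.

```python
import numpy as np

def soft_threshold(v, t):
    # Element-wise shrinkage: the proximal operator of the l1 norm.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_l1(D, y, lam=0.1, n_iter=2000):
    # Minimize ||Dx - y||_2^2 + lam * ||x||_1 by iterative
    # shrinkage-thresholding with fixed step 1/L, where
    # L = 2 * ||D||_2^2 is the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * np.linalg.norm(D, 2) ** 2)
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * D.T @ (D @ x - y)  # gradient of the quadratic term
        x = soft_threshold(x - step * grad, lam * step)
    return x
```

On a random overcomplete dictionary with a 2-sparse ground truth, the loop drives the objective well below its value at zero while keeping most coefficients near zero, which is the behavior SRC relies on.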

Experimental results

We tested our proposed KMTS-TPWCRC 3D PFR framework on six challenging 3D face datasets including Bosphorus, GavabDB, UMB-DB, SHREC 2008, BU-3DFE and FRGC v2.0.

In this section, we first introduce our 3D data preprocessing in Section 5.1. We then present the results of our proposed approach on partial facial datasets in Section 5.2. Comparative results of our approach with respect to large expression deformations are presented in Section 5.3. Comparative results of different classifiers and the

Conclusions

In this paper, we proposed an automatic 3D PFR approach using single sample input. The proposed approach represents a 3D face with a set of local geometrical descriptors called KMTS. A TPWCRC framework is proposed to address the single training sample based 3D PFR problem. During the first phase, a class-based probability estimation is computed for each probe scan as a prior classification knowledge. Then, the resulting class-based probability estimation is incorporated into a weighted

Conflict of interest

None declared.

Acknowledgment

This work is supported by the Natural Science Foundation of China under Grant nos. 61403265 and 61471371. This work is also supported by the Science and Technology Plan of Sichuan Province under Grant no. 2015SZ0226.

Yinjie Lei received his M.S. degree from Sichuan University (SCU), China, in the area of image processing, and the Ph.D. degree in Computer Vision from University of Western Australia (UWA), Australia. He is currently an assistant professor at Sichuan University, Chengdu, China. His research interests include image and text understanding, 3D face processing and recognition, 3D modeling, machine learning and statistical pattern recognition.

References (72)

  • Y. Guo et al.

    Rotational projection statistics for 3D local surface description and object recognition

    Int. J. Comput. Vis.

    (2013)
  • I. Kakadiaris et al.

Three-dimensional face recognition in the presence of facial expressions: an annotated deformable model approach

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2007)
  • Y. Guo et al.

    A novel local surface feature for 3D object recognition under clutter and occlusion

    Inf. Sci.

    (2014)
  • Y. Wang et al.

    Robust 3D face recognition by local shape difference boosting

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2010)
  • Y. Guo et al.

    An accurate and robust range image registration algorithm for 3D object modeling

    IEEE Trans. Multimed.

    (2013)
  • H. Mohammadzade et al.

    Iterative closest normal point for 3D face recognition

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2012)
  • A. Savran, N. Alyüz, H. Dibeklioğlu, O. Çeliktutan, B. Gökberk, B. Sankur, L. Akarun, Bosphorus database for 3D face...
  • A. Moreno, A. Sanchez, GavabDB: a 3D face database, in: Proceedings of Workshop on Biometrics on the Internet, 2004,...
  • A. Colombo et al.

    Three-dimensional occlusion detection and restoration of partially occluded faces

    J. Math. Imaging Vis.

    (2011)
  • L. Zhang, M. Yang, X. Feng, Sparse representation or collaborative representation: Which helps face recognition?, in:...
  • H. Drira et al.

    3D face recognition under expressions, occlusions and pose variations

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2013)
  • H. Li et al.

Towards 3D face recognition in the real: a registration-free approach using fine-grained matching of 3D keypoint descriptors

    Int. J. Comput. Vis.

    (2014)
  • N. Alyuz et al.

    3-D face recognition under occlusion using masked projection

    IEEE Trans. Inf. Forensics Secur.

    (2013)
  • W. Zhao et al.

Face recognition: a literature survey

    ACM Comput. Surv.

    (2003)
  • A. Mian et al.

    Keypoint detection and local feature matching for textured 3D face recognition

    Int. J. Comput. Vis.

    (2007)
  • C. Faltemier et al.

    A region ensemble for 3D face recognition

    IEEE Trans. Inf. Forensics Secur.

    (2008)
  • K. Chang et al.

    Multiple nose region matching for 3D face recognition under varying facial expression

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2006)
  • S. Gupta et al.

    Anthropometric 3D face recognition

    Int. J. Comput. Vis.

    (2010)
  • X. Li, T. Jia, H. Zhang, Expression-insensitive 3D face recognition using sparse representation, in: IEEE Conference on...
  • L. Ballihi, B. Ben Amor, M. Daoudi, A. Srivastava, D. Aboutajdine, Selecting 3D curves on the nasal surface using...
  • S. Berretti et al.

    3D face recognition using isogeodesic stripes

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2010)
  • C. Queirolo et al.

    3D face recognition using simulated annealing and the surface interpenetration measure

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2010)
  • N. Alyuz, B. Gokberk, L. Akarun, A 3D face recognition system for expression and occlusion invariance, in: 2nd IEEE...
  • S. Berretti et al.

    Selecting stable keypoints and local descriptors for person identification using 3D face scans

    Vis. Comput.

    (2014)
  • A. Colombo, C. Cusano, R. Schettini, Detection and restoration of occlusions for 3D face recognition, in: 2006 IEEE...
  • A. Colombo et al.

    Gappy PCA classification for occlusion tolerant 3D face detection

    J. Math. Imaging Vis.

    (2009)


    Yulan Guo received his B.E. and Ph.D. degrees from National University of Defense Technology (NUDT) in 2008 and 2015, respectively. He is currently an assistant professor at NUDT. He was a visiting (joint) PhD student at the University of Western Australia from November 2011 to November 2014. He authored more than 20 peer reviewed journal and conference publications (including IEEE TPAMI, IJCV, PR and IEEE TMM) and one book chapter. He served as a reviewer for more than 10 international journals and several conferences. His research interests include 3D feature extraction, 3D modeling, 3D object recognition, and 3D face recognition.

    Munawar Hayat received his Bachelor of Engineering degree from National University of Science and Technology (NUST) in 2009. Later, he was awarded Erasmus Mundus Scholarship for a joint European Master׳s degree program. He completed his PhD in 2015 from The University of Western Australia (UWA) sponsored by the Scholarship for International Research Fees (SIRF). He is currently a postdoctoral researcher at IBM Research Lab in Melbourne, Australia. His research interests include computer vision, signal and image processing, pattern recognition and machine learning.

    Mohammed Bennamoun received the M.S. degree in Control Theory from Queen׳s University, Kingston, Canada, and the Ph.D. degree in Computer Vision from Queen׳s/QUT, Brisbane, Australia. He is currently a winthrop professor at the University of Western Australia, Crawley, Australia. His research interests include control theory, robotics, obstacle avoidance, face/object recognition, artificial neural networks, signal/image processing, and computer vision. He published more than 150 journal and conference publications.

    Xinzhi Zhou received the B.S. and M.S. degrees from Chongqing University, Chongqing, China, and the Ph.D. degree from Sichuan University, Sichuan, China, in 1988, 1991, and 2003, respectively. He is a professor with the School of Electronics and Information Engineering, Sichuan University. His current research interests include new sensing technology, intelligent system, and intelligent information processing.
