Pattern Recognition

Volume 62, February 2017, Pages 125-134

Collaborative probabilistic labels for face recognition from single sample per person

https://doi.org/10.1016/j.patcog.2016.08.007

Highlights

  • Constructed probabilistic graph propagates discrimination from generic to gallery.

  • The adaptive variation type for a given sample can be automatically estimated.

  • CPL incorporates a novel probabilistic label reconstruction based classifier.

Abstract

Single sample per person (SSPP) recognition is one of the most challenging problems in face recognition (FR) due to the lack of information available to predict the variations in the query sample. To address this problem, we propose in this paper a novel face recognition algorithm based on a robust collaborative representation (CR) and a probabilistic graph model, called Collaborative Probabilistic Labels (CPL). First, by utilizing label propagation, we construct probabilistic labels for the samples in the generic training set corresponding to those in the gallery set, so that the discriminative information of the unlabeled data can be effectively exploited in our method. Then, the adaptive variation type for a given test sample is automatically estimated. Finally, we propose a novel reconstruction-based classifier for the test sample with its corresponding adaptive dictionary and probabilistic labels. The proposed probabilistic graph based model is adaptively robust to various variations in face images, including illumination, expression, occlusion, pose, etc., and is able to reduce the required training images to one sample per class. Experimental results on five widely used face databases are presented to demonstrate the efficacy of the proposed approach.

Introduction

Over the past three decades, as one of the most visible applications in computer vision, face recognition (FR) has received significant attention [1], and a large number of algorithms have been proposed in recent years [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25]. Representative and popular algorithms include principal component analysis (PCA) [13], linear discriminant analysis (LDA) [2], locality preserving projections (LPP) [8], sparse representation based classification (SRC) [19] and their weighted, kernelized, and two-dimensional variants [10], [11], [15], [17], [25]. Recently, it has been shown that it is the collaborative representation (CR) [23], [24], rather than the l1-norm sparsity, that makes SRC powerful for face classification; moreover, CR has significantly lower computational complexity. However, because it is usually difficult, expensive and time-consuming to collect sufficient labeled samples, many traditional methods, including collaborative representation based classification (CRC), suffer when only a few labeled samples per person are available. To address this problem, semi-supervised learning (SSL) algorithms [43], [44], [45], [46], [47], which exploit a large number of unlabeled data to help build a better classifier from the labeled data, have been proposed in recent years.

The performance of the above methods in face recognition, however, is heavily influenced by the number of training samples per person [26], [27]. In many practical FR applications (e.g., law enforcement, e-passport, surveillance, ID card identification), only a single sample per person is available. This makes FR particularly hard, since the information available for prediction is very limited while the variations in the query sample are abundant, including background illumination, pose, and facial corruption/disguise such as makeup, beard, and occlusions (glasses and scarves). This problem is known as single sample per person (SSPP) classification. To address it, many specially designed FR methods have been developed, which can generally be classified into three categories [26]: image partitioning, virtual sample generation and generic learning.

For the first category, the image-partitioning-based methods [29], [30], [31], [32], [33] first partition each face image into several local patches and then apply discriminant learning techniques for feature extraction. For example, Lu et al. proposed the Discriminative Multi-Manifold Analysis (DMMA) method [29], [30], which partitions each image into several local patches as features for training and converts face recognition into a manifold-manifold matching problem. Although these methods have led to improved FR results, local feature extraction and discriminative learning from local patches are sensitive to image variations (e.g., extreme illumination and occlusion).

For the second category [34], [35], [36], [37], additional training samples for each person are virtually generated so that discriminative subspace learning can be used for feature extraction. An early attempt [34] showed that images with the same expression lie on a common subspace, called the expression subspace. By projecting an image with an arbitrary expression into the expression subspaces, new expression images can be synthesized. With the synthesized images for individuals who have only one image sample, a more accurate estimate of the within-individual variability can be obtained, which contributes to a significant improvement in recognition. A common shortcoming of these methods, however, is that the highly correlated virtually generated samples may contain much redundancy, and some linear transformations (e.g., expression subspaces) may not be suitable for all face variations, such as pose and uncontrollable noise.

For the third category [38], [39], [40], [41], [42], an additional generic training set with multiple samples per person (MSPP) is used to extract discriminative features, which are subsequently applied to identify the people enrolled with a single sample each. Based on the observation that face variations of different subjects share much similarity, Yang et al. [38] learned a sparse variation dictionary by taking the relationship between the gallery set and an additional generic training set into account. The generic training set with multiple samples per person can bring new and useful information to the SSPP gallery set, and the sparse variation dictionary learning (SVDL) scheme shows state-of-the-art performance in FR with SSPP.

Although many improvements have been reported, SVDL and other generic-learning-based methods focus only on the variations and ignore both the discriminative information of the generic training set and the essential collaborative representation relationship between the gallery set and the generic training set, which will be described in detail in Section 3.1. Motivated by this observation, by recent advances in CR [23], [24] showing that samples from other categories can also contribute to reconstructing the query sample, and by SSL methods [43], [44], [45], [46], [47] that construct a faithful graph from both labeled and unlabeled samples, we propose in this paper a novel Collaborative Probabilistic Labels (CPL) method for FR with SSPP, whose framework is illustrated in Fig. 1. The training samples in the gallery set are used to build a gallery dictionary, and the samples in the generic training set, which contain plenty of variations, are used as adaptive probabilistic label dictionaries. Utilizing group sparse coding [48], we first seek a specific adaptive variation subset in the generic training set for each sample in the gallery set. By extracting the collaborative reconstructive relationship between each gallery sample and its corresponding adaptive variation subset and applying label propagation, we then acquire the probabilistic label matrix. Finally, given a test sample, we construct a novel reconstruction-based classifier from the probabilistic label matrix, the gallery set and an adaptive variation dictionary corresponding to the test sample.
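The label propagation step mentioned above can be sketched with a standard graph propagation scheme (a hedged illustration only: the paper builds its affinities from CR coefficients, which is omitted here, and the function name and parameter `alpha` are hypothetical choices, not the authors' notation):

```python
import numpy as np

def propagate_labels(W, Y, alpha=0.99):
    """Standard graph label propagation (closed-form fixed point).

    W : (n, n) symmetric affinity matrix over all samples
        (labeled gallery samples plus unlabeled generic samples).
    Y : (n, c) initial label matrix; one-hot rows for labeled
        samples, all-zero rows for unlabeled ones.
    Returns a row-normalized (n, c) probabilistic label matrix.
    """
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = D_inv_sqrt @ W @ D_inv_sqrt              # symmetrically normalized affinity
    n = W.shape[0]
    # F = (I - alpha * S)^{-1} Y is the fixed point of repeated propagation
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    F = np.maximum(F, 0)
    return F / np.maximum(F.sum(axis=1, keepdims=True), 1e-12)
```

Each row of the returned matrix can then be read as the probabilistic label of the corresponding sample, which is the role the probabilistic label matrix plays in the CPL pipeline.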

Section snippets

Collaborative representation

Suppose that we have $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{m \times n}$, where $x_i$, $m$ and $n$ denote the $i$-th sample, the dimensionality and the total number of training samples, respectively. To collaboratively represent the query sample $y \in \mathbb{R}^m$ using $X$, we can use the regularized least squares method:
$$\hat{\rho} = \arg\min_{\rho} \left\{ \|y - X\rho\|_2^2 + \lambda \|\rho\|_2^2 \right\}, \quad (1)$$
where $\lambda$ is the regularization parameter. The solution of CR with regularized least squares in Eq. (1) can be derived analytically as
$$\hat{\rho} = (X^T X + \lambda I)^{-1} X^T y, \quad (2)$$
where $I$ is an identity matrix of size $n$
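The closed-form solution of Eq. (2) can be turned into a minimal CRC-style classifier sketch (assuming NumPy; the function name is hypothetical, and the plain class-wise residual used here is a simplification of the usual CRC decision rule, which additionally normalizes residuals by the coefficient norm):

```python
import numpy as np

def crc_classify(X, labels, y, lam=0.01):
    """Collaborative-representation classification sketch.

    X      : (m, n) training matrix, one column per sample.
    labels : length-n array of class ids for the columns of X.
    y      : (m,) query sample.

    Solves rho_hat = (X^T X + lam*I)^{-1} X^T y, then assigns y to
    the class whose columns give the smallest reconstruction residual.
    """
    n = X.shape[1]
    rho = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    residuals = {}
    for c in np.unique(labels):
        idx = labels == c                      # columns of class c
        residuals[c] = np.linalg.norm(y - X[:, idx] @ rho[idx])
    return min(residuals, key=residuals.get)
```

Note that all classes contribute to the coding vector rho, which is precisely the collaborative aspect: the query is represented over the whole dictionary, and only the residual computation is class-specific.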

Motivation

In supervised learning, a test sample can be classified reliably given a dictionary with enough labeled training samples for each category. In semi-supervised learning, the labeled samples are very limited. If we use only a few labeled data as a dictionary to represent a test sample, the representation error may be very large, even when the unlabeled test sample actually shares the same identity with the labeled training samples. One obvious solution to this problem is to

Experiments

In this section, in order to evaluate the proposed CPL method, we test its performance on five widely used benchmark datasets, i.e., AR [52], CMU PIE [55], Multi-PIE [53], FERET [54] and Extended Yale-B [56], in face classification tasks. We also compare the performance of the proposed method with several state-of-the-art methods on FR with SSPP, including SVDL [38], Adaptive Generic Learning (AGL) [40], Extended SRC (ESRC) [42], Expression Subspace Projection (ESP) [34] and

Conclusion and future work

In this paper, we developed a novel technique for FR with SSPP, called Collaborative Probabilistic Labels (CPL). The proposed method has stronger discriminative ability than the comparison methods, and the efficacy of CPL has been evaluated on multiple recognition tasks with several popular databases. With the best discriminative ability among the compared methods, our algorithm facilitates effective probabilistic label learning of the face space and exhibits impressive

Acknowledgments

This work is supported in part by Graduate Research and Innovation Foundation of Jiangsu Province, China under Grant KYLX15_0379, in part by the National Natural Science Foundation of China under Grants 61273251, 61401209, and 61402203, in part by the Natural Science Foundation of Jiangsu Province under Grant BK20140790, and in part by China Postdoctoral Science Foundation under Grants 2014T70525 and 2013M531364.

Hongkun Ji received the B.E. in Network Engineering from Nanjing University of Science and Technology, Nanjing, China, in 2012. He is currently a Ph.D. Candidate in the school of Computer Science and Engineering, Nanjing University of Science and Technology, China. His research interests include pattern recognition, machine learning, image processing, and computer vision.

References (56)

  • D. Cai, X. He, J. Han, Spectral regression for efficient regularized subspace learning, in: Proceedings of...
  • D. Cai, X. He, Y. Hu, J. Han, T. Huang, Learning a spatially smooth subspace for face recognition, in: Proceedings of...
  • H. Chen, H. Chang, T. Liu, Local discriminant embedding and its variants, in: Proceedings of International Conference...
  • Y. Fu et al., Classification and feature extraction by simplexization, IEEE Trans. Inf. Forensics Secur. (2008)
  • X. He, D. Cai, S. Yan, H. Zhang, Neighborhood preserving embedding, in: Proceedings of International Conference on...
  • X. He et al., Face recognition using Laplacian faces, IEEE Trans. Pattern Anal. Mach. Intell. (2005)
  • H. Lu et al., MPCA: multilinear principal component analysis of tensor objects, IEEE Trans. Neural Netw. (2008)
  • J. Lu et al., A doubly weighted approach for appearance-based subspace learning methods, IEEE Trans. Inf. Forensics Secur. (2010)
  • J. Lu et al., Regularized locality preserving projections and its extensions for face recognition, IEEE Trans. Syst. Man Cybern. Part B: Cybern. (2010)
  • M. Turk et al., Eigenfaces for recognition, J. Cogn. Neurosci. (1991)
  • X. Wang et al., A unified framework for subspace face recognition, IEEE Trans. Pattern Anal. Mach. Intell. (2004)
  • S. Yan et al., A parameter-free framework for general supervised subspace learning, IEEE Trans. Inf. Forensics Secur. (2007)
  • H.K. Ji et al., Fractional-order embedding supervised canonical correlations analysis with applications to feature extraction and recognition, Neural Process. Lett. (2016)
  • M.H. Yang, Kernel eigenfaces vs. kernel fisherfaces: face recognition using kernel methods, in: Proceedings of...
  • J. Wright et al., Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell. (2009)
  • K. Huang et al., Sparse representation for signal classification, Adv. Neural Inf. Process. Syst. (2006)
  • M. Davenport et al., The smashed filter for compressive classification and target recognition, Electron. Imaging Int. Soc. Opt. Photonics (2007)
  • M. Davenport et al., Detection and Estimation with Compressive Measurements (Technical Report) (2006)

QuanSen Sun received the Ph.D. degree in Pattern Recognition and Intelligence System from Nanjing University of Science and Technology (NUST), China, in 2006. He is a professor in the Department of Computer Science at NUST. He visited the Department of Computer Science and Engineering, The Chinese University of Hong Kong in 2004 and 2005, respectively. His current interests include pattern recognition, image processing, remote sensing information system, and medical image analysis.

Zexuan Ji received the B.E. in Computer Science and the Ph.D. in Pattern Recognition and Intelligence System from Nanjing University of Science and Technology, Nanjing, China, in 2007 and 2012, respectively. He is currently a postdoctoral research fellow in the School of Computer Science and Engineering at the Nanjing University of Science and Technology. His research interests include medical imaging, image processing and pattern recognition.

Yunhao Yuan received the M.Sc. degree in computer science and technology from Yangzhou University (YZU), China, in 2009, and the Ph.D. degree in pattern recognition and intelligence system from Nanjing University of Science and Technology (NUST), China, in 2013. He received two National Scholarships for Graduate Students and for Undergraduate Students from the Ministry of Education, China, an Outstanding Ph.D. Thesis Award and two Top-class Scholarships from the NUST. He is currently an associate professor with the Department of Computer Science and Technology, Jiangnan University (JNU). He is the author or co-author of more than 35 scientific papers. He serves as a reviewer of several international journals such as IEEE TNNLS and IEEE TSMC: Systems. He is a member of ACM, the International Society of Information Fusion (ISIF), and the China Computer Federation (CCF). He is also an Artificial Intelligence Technical Committee Member and Big Data Technical Committee Member of the Jiangsu Computer Society, China. His research interests include pattern recognition, image processing, computer vision, and information fusion.

Guoqing Zhang received his B.S. and Master degrees from the School of Information Engineering at Yangzhou University in 2009 and 2012. He is currently a Ph.D. Candidate in the School of Computer Science and Engineering, Nanjing University of Science and Technology, China. His research interests include pattern recognition, machine learning, image processing, and computer vision.
