Pattern Recognition

Volume 62, February 2017, Pages 125-134

Collaborative probabilistic labels for face recognition from single sample per person

https://doi.org/10.1016/j.patcog.2016.08.007

Highlights

  • Constructed probabilistic graph propagates discrimination from generic to gallery.

  • The adaptive variation type for a given sample can be automatically estimated.

  • CPL incorporates a novel probabilistic label reconstruction based classifier.

Abstract

Single sample per person (SSPP) recognition is one of the most challenging problems in face recognition (FR) due to the lack of information available to predict the variations in the query sample. To address this problem, we propose in this paper a novel face recognition algorithm based on a robust collaborative representation (CR) and a probabilistic graph model, called Collaborative Probabilistic Labels (CPL). First, by utilizing label propagation, we construct probabilistic labels for the samples in the generic training set corresponding to those in the gallery set, so that the discriminative information of the unlabeled data can be effectively exploited in our method. Then, the adaptive variation type for a given test sample is automatically estimated. Finally, we propose a novel reconstruction-based classifier for the test sample with its corresponding adaptive dictionary and probabilistic labels. The proposed probabilistic graph based model is adaptively robust to various variations in face images, including illumination, expression, occlusion, pose, etc., and is able to reduce the required training images to one sample per class. Experimental results on five widely used face databases are presented to demonstrate the efficacy of the proposed approach.

Introduction

Over the past three decades, as one of the most visible applications in computer vision, face recognition (FR) has received significant attention [1], and a large number of algorithms have been proposed in recent years [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25]. Representative and popular algorithms include principal component analysis (PCA) [13], linear discriminant analysis (LDA) [2], locality preserving projections (LPP) [8], sparse representation based classification (SRC) [19] and their weighted, kernelized, and two-dimensional variants [10], [11], [15], [17], [25]. Recently, it has been shown that it is the collaborative representation (CR) [23], [24], rather than the l1-norm sparsity, that makes SRC powerful for face classification; moreover, CR has significantly lower computational complexity. However, because it is usually difficult, expensive and time-consuming to collect sufficient labeled samples, many traditional methods, including collaborative representation based classification (CRC), suffer when only a few labeled samples per person are available. To address this problem, semi-supervised learning (SSL) algorithms [43], [44], [45], [46], [47], which exploit a large number of unlabeled data to help build a better classifier from the labeled data, have been proposed in recent years.

The performance of the above methods in face recognition, however, is heavily influenced by the number of training samples per person [26], [27]. In many practical FR applications (e.g., law enforcement, e-passport, surveillance, ID card identification), only a single sample per person is available. This makes FR particularly hard, since the information available for prediction is very limited while the variations in the query sample are abundant, including background illumination, pose, and facial corruption/disguise such as makeup, beard, and occlusions (glasses and scarves). This problem is known as single sample per person (SSPP) classification. To address it, many specially designed FR methods have been developed, which can generally be classified into three categories [26]: image partitioning, virtual sample generation and generic learning.

For the first category, the image-partitioning-based methods [29], [30], [31], [32], [33] first partition each face image into several local patches and then apply discriminant learning techniques for feature extraction. For example, Lu et al. proposed the Discriminative Multi-Manifold Analysis (DMMA) method [29], [30], which partitions each image into several local patches as features for training and converts face recognition into a manifold-manifold matching problem. Although these methods have led to improved FR results, local feature extraction and discriminative learning from local patches are sensitive to image variations (e.g., extreme illumination and occlusion).

For the second category [34], [35], [36], [37], additional training samples for each person are virtually generated so that discriminative subspace learning can be used for feature extraction. An early attempt [34] showed that images with the same expression lie on a common subspace, called the expression subspace. By projecting an image with an arbitrary expression into the expression subspaces, new expression images can be synthesized. With the synthesized images for individuals who have only one image sample, a more accurate estimate of the within-individual variability can be obtained, which contributes to a significant improvement in recognition. A common shortcoming of these methods, however, is that the highly correlated virtually generated samples may contain much redundancy, and some linear transformations (e.g., expression subspaces) may not be suitable for all face variations, such as pose and uncontrollable noise.

For the third category [38], [39], [40], [41], [42], an additional generic training set with multiple samples per person (MSPP) is used to extract discriminative features, which are subsequently applied to identify the people enrolled with a single sample each. Based on the observation that face variations of different subjects share much similarity, Yang et al. [38] learned a sparse variation dictionary by taking the relationship between the gallery set and an additional generic training set into account. The generic training set with multiple samples per person can bring new and useful information to the SSPP gallery set, and the sparse variation dictionary learning (SVDL) scheme shows state-of-the-art performance in FR with SSPP.

Although many improvements have been reported, SVDL and other generic-learning-based methods focus only on the variations and ignore both the discriminative information of the generic training set and the essential collaborative representation relationship between the gallery set and the generic training set, which will be described in detail in Section 3.1. Motivated by this observation, by recent advances in CR [23], [24] showing that samples from other categories can also contribute to reconstructing the query sample, and by SSL methods [43], [44], [45], [46], [47] that construct a faithful graph from both labeled and unlabeled samples, we propose in this paper a novel Collaborative Probabilistic Labels (CPL) method for FR with SSPP, whose framework is illustrated in Fig. 1. The training samples in the gallery set are used to build a gallery dictionary, and the samples in the generic training set, which contain plenty of variations, are used as adaptive probabilistic label dictionaries. Utilizing group sparse coding [48], we first seek a specific adaptive variation subset in the generic training set for each sample in the gallery set. By extracting the collaborative reconstructive relationship between each gallery sample and its corresponding adaptive variation subset and applying label propagation, we then acquire the probabilistic label matrix. Finally, given a test sample, we construct a novel reconstruction-based classifier from the probabilistic label matrix, the gallery set and an adaptive variation dictionary corresponding to the test sample.
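The label propagation step mentioned above can be sketched with a standard graph propagation scheme (a hedged illustration only: the paper builds its affinities from CR coefficients, which is omitted here, and the function name and parameter `alpha` are hypothetical choices, not the authors' notation):

```python
import numpy as np

def propagate_labels(W, Y, alpha=0.99):
    """Standard graph label propagation (closed-form fixed point).

    W : (n, n) symmetric affinity matrix over all samples
        (labeled gallery samples plus unlabeled generic samples).
    Y : (n, c) initial label matrix; one-hot rows for labeled
        samples, all-zero rows for unlabeled ones.
    Returns a row-normalized (n, c) probabilistic label matrix.
    """
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = D_inv_sqrt @ W @ D_inv_sqrt              # symmetrically normalized affinity
    n = W.shape[0]
    # F = (I - alpha * S)^{-1} Y is the fixed point of repeated propagation
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    F = np.maximum(F, 0)
    return F / np.maximum(F.sum(axis=1, keepdims=True), 1e-12)
```

Each row of the returned matrix can then be read as the probabilistic label of the corresponding sample, which is the role the probabilistic label matrix plays in the CPL pipeline.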

Section snippets

Collaborative representation

Suppose that we have $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{m \times n}$, where $x_i$, $m$ and $n$ denote the $i$-th sample, the dimensionality and the total number of training samples, respectively. To collaboratively represent the query sample $y \in \mathbb{R}^m$ using $X$, we can use the regularized least squares method:
$$\hat{\rho} = \arg\min_{\rho} \left\{ \|y - X\rho\|_2^2 + \lambda \|\rho\|_2^2 \right\}, \quad (1)$$
where $\lambda$ is the regularization parameter. The solution of CR with regularized least squares in Eq. (1) can be derived analytically as
$$\hat{\rho} = (X^T X + \lambda I)^{-1} X^T y, \quad (2)$$
where $I$ is an identity matrix of size $n$
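The closed-form solution of Eq. (2) can be turned into a minimal CRC-style classifier sketch (assuming NumPy; the function name is hypothetical, and the plain class-wise residual used here is a simplification of the usual CRC decision rule, which additionally normalizes residuals by the coefficient norm):

```python
import numpy as np

def crc_classify(X, labels, y, lam=0.01):
    """Collaborative-representation classification sketch.

    X      : (m, n) training matrix, one column per sample.
    labels : length-n array of class ids for the columns of X.
    y      : (m,) query sample.

    Solves rho_hat = (X^T X + lam*I)^{-1} X^T y, then assigns y to
    the class whose columns give the smallest reconstruction residual.
    """
    n = X.shape[1]
    rho = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    residuals = {}
    for c in np.unique(labels):
        idx = labels == c                      # columns of class c
        residuals[c] = np.linalg.norm(y - X[:, idx] @ rho[idx])
    return min(residuals, key=residuals.get)
```

Note that all classes contribute to the coding vector rho, which is precisely the collaborative aspect: the query is represented over the whole dictionary, and only the residual computation is class-specific.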

Motivation

In supervised learning, a test sample can be classified reliably given a dictionary with enough labeled training samples for each category. In semi-supervised learning, the labeled samples are very limited. If we use only a few labeled data as a dictionary to represent a test sample, the representation error may be very large, even when the unlabeled test sample actually shares the same identity with the labeled training samples. One obvious solution to this problem is to

Experiments

In this section, in order to evaluate the proposed CPL method, we test its performance on five widely used benchmark datasets, i.e., AR [52], CMU PIE [55], Multi-PIE [53], FERET [54] and Extended Yale-B [56], in face classification tasks. We also compare the performance of the proposed method with several state-of-the-art methods on FR with SSPP, including SVDL [38], Adaptive Generic Learning (AGL) [40], Extended SRC (ESRC) [42], Expression Subspace Projection (ESP) [34] and

Conclusion and future work

In this paper, we developed a novel technique for FR with SSPP, called Collaborative Probabilistic Labels (CPL). The proposed method has stronger discriminative ability than the comparison methods, and the efficacy of CPL has been evaluated on multiple recognition tasks with several popular databases. With the best discriminative ability among the compared methods, our algorithm facilitates effective probabilistic label learning of the face space and exhibits impressive

Acknowledgments

This work is supported in part by Graduate Research and Innovation Foundation of Jiangsu Province, China under Grant KYLX15_0379, in part by the National Natural Science Foundation of China under Grants 61273251, 61401209, and 61402203, in part by the Natural Science Foundation of Jiangsu Province under Grant BK20140790, and in part by China Postdoctoral Science Foundation under Grants 2014T70525 and 2013M531364.

Hongkun Ji received the B.E. in Network Engineering from Nanjing University of Science and Technology, Nanjing, China, in 2012. He is currently a Ph.D. Candidate in the school of Computer Science and Engineering, Nanjing University of Science and Technology, China. His research interests include pattern recognition, machine learning, image processing, and computer vision.

References (56)

  • D. Cai, X. He, J. Han, Spectral regression for efficient regularized subspace learning, in: Proceedings of...
  • D. Cai, X. He, Y. Hu, J. Han, T. Huang, Learning a spatially smooth subspace for face recognition, in: Proceedings of...
  • H. Chen, H. Chang, T. Liu, Local discriminant embedding and its variants, in: Proceedings of International Conference...
  • Y. Fu et al., Classification and feature extraction by simplexization, IEEE Trans. Inf. Forensics Secur. (2008)
  • X. He, D. Cai, S. Yan, H. Zhang, Neighborhood preserving embedding, in: Proceedings of International Conference on...
  • X. He et al., Face recognition using Laplacian faces, IEEE Trans. Pattern Anal. Mach. Intell. (2005)
  • H. Lu et al., MPCA: multilinear principal component analysis of tensor objects, IEEE Trans. Neural Netw. (2008)
  • J. Lu et al., A doubly weighted approach for appearance-based subspace learning methods, IEEE Trans. Inf. Forensics Secur. (2010)
  • J. Lu et al., Regularized locality preserving projections and its extensions for face recognition, IEEE Trans. Syst. Man Cybern. Part B: Cybern. (2010)
  • M. Turk et al., Eigenfaces for recognition, J. Cogn. Neurosci. (1991)
  • X. Wang et al., A unified framework for subspace face recognition, IEEE Trans. Pattern Anal. Mach. Intell. (2004)
  • S. Yan et al., A parameter-free framework for general supervised subspace learning, IEEE Trans. Inf. Forensics Secur. (2007)
  • H.K. Ji et al., Fractional-order embedding supervised canonical correlations analysis with applications to feature extraction and recognition, Neural Process. Lett. (2016)
  • M.H. Yang, Kernel eigenfaces vs. kernel fisherfaces: face recognition using kernel methods, in: Proceedings of...
  • J. Wright et al., Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell. (2009)
  • K. Huang et al., Sparse representation for signal classification, Adv. Neural Inf. Process. Syst. (2006)
  • M. Davenport et al., The smashed filter for compressive classification and target recognition, Electron. Imaging Int. Soc. Opt. Photonics (2007)
  • M. Davenport et al., Detection and Estimation with Compressive Measurements (Technical Report) (2006)

QuanSen Sun received the Ph.D. degree in Pattern Recognition and Intelligence System from Nanjing University of Science and Technology (NUST), China, in 2006. He is a professor in the Department of Computer Science at NUST. He visited the Department of Computer Science and Engineering, The Chinese University of Hong Kong in 2004 and 2005, respectively. His current interests include pattern recognition, image processing, remote sensing information system, and medical image analysis.

Zexuan Ji received the B.E. in Computer Science and the Ph.D. in Pattern Recognition and Intelligence System from Nanjing University of Science and Technology, Nanjing, China, in 2007 and 2012, respectively. He is currently a postdoctoral research fellow in the School of Computer Science and Engineering at the Nanjing University of Science and Technology. His research interests include medical imaging, image processing and pattern recognition.

Yunhao Yuan received the M.Sc. degree in computer science and technology from Yangzhou University (YZU), China, in 2009, and the Ph.D. degree in pattern recognition and intelligence system from Nanjing University of Science and Technology (NUST), China, in 2013. He received two National Scholarships for Graduate Students and for Undergraduate Students from the Ministry of Education, China, an Outstanding Ph.D. Thesis Award and two Top-class Scholarships from the NUST. He is currently an associate professor with the Department of Computer Science and Technology, Jiangnan University (JNU). He is the author or co-author of more than 35 scientific papers. He serves as a reviewer of several international journals such as IEEE TNNLS and IEEE TSMC: Systems. He is a member of ACM, the International Society of Information Fusion (ISIF), and the China Computer Federation (CCF). He is also an Artificial Intelligence Technical Committee Member and Big Data Technical Committee Member of the Jiangsu Computer Society, China. His research interests include pattern recognition, image processing, computer vision, and information fusion.

Guoqing Zhang received his B.S. and Master degrees from the School of Information Engineering at Yangzhou University in 2009 and 2012. He is currently a Ph.D. Candidate in the School of Computer Science and Engineering, Nanjing University of Science and Technology, China. His research interests include pattern recognition, machine learning, image processing, and computer vision.
