Abstract
Face recognition remains challenging because of intrinsic complexity, external variations, and the limited number of training samples. In this paper, a novel face recognition method based on the probabilistic latent semantic analysis (pLSA) model is developed. It consists of two stages: bag-of-words feature extraction and semantic representation learning. In the first stage, to capture more structural information, a region-specific dictionary strategy is employed, i.e., a separate dictionary is generated for each region, and the encoded and sum-pooled features of all regions are concatenated. In the second stage, a discriminative pLSA (DpLSA) model is presented, which initializes the word-topic distribution \(P(w|z_k)\) with the center point of the training data from category k. As a result, the problem of choosing an appropriate number of topics in classical topic models is alleviated, and training DpLSA is fast, requiring only a few iterations. Moreover, the discovered topic-document distribution \(P\left( z|d\right) \) is discriminative and semantic: its dominant topic entry corresponds to the category label of image d, so classification can be performed directly from \(P\left( z|d\right) \). Extensive experiments on four representative databases demonstrate that the proposed DpLSA is effective for face recognition with a single training sample per person and is robust, to a certain degree, to illumination, pose, and occlusion.
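To illustrate the second stage, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the bag-of-words histograms are already available as a non-negative matrix `X` (rows are images, columns are visual words), uses one topic per category, and initializes \(P(w|z_k)\) with the normalized mean histogram of category k. The function names `init_word_topic` and `plsa_em`, the iteration count, and the toy data are hypothetical. The updates are the standard pLSA EM updates, and handling probe images by fold-in with \(P(w|z)\) held fixed is an assumption about how test images might be processed.

```python
import numpy as np

def init_word_topic(X_train, y_train, n_classes, eps=1e-12):
    """One topic per class: P(w|z_k) is the normalized mean histogram of class k."""
    centers = np.stack([X_train[y_train == k].mean(axis=0) for k in range(n_classes)])
    centers = centers + eps  # avoid exact zeros so EM can still move mass
    return centers / centers.sum(axis=1, keepdims=True)

def plsa_em(X, p_w_z, n_iter=5, update_topics=True, eps=1e-12):
    """Standard pLSA EM on a count matrix X of shape (docs, words).

    p_w_z has shape (topics, words). With update_topics=False only P(z|d)
    is re-estimated (fold-in), which is assumed here for probe images.
    """
    n_docs, n_topics = X.shape[0], p_w_z.shape[0]
    p_w_z = p_w_z.copy()
    p_z_d = np.full((n_docs, n_topics), 1.0 / n_topics)  # uniform start
    for _ in range(n_iter):
        # ratio[d, w] = n(d, w) / sum_z P(z|d) P(w|z); the E-step is implicit
        ratio = X / (p_z_d @ p_w_z + eps)
        new_p_z_d = p_z_d * (ratio @ p_w_z.T)
        if update_topics:
            new_p_w_z = p_w_z * (p_z_d.T @ ratio) + eps
            p_w_z = new_p_w_z / new_p_w_z.sum(axis=1, keepdims=True)
        p_z_d = new_p_z_d / new_p_z_d.sum(axis=1, keepdims=True)
    return p_w_z, p_z_d

# Toy data (hypothetical): three categories, one training histogram each.
X_train = np.array([[8, 7, 1, 0, 1, 0],
                    [0, 1, 9, 8, 0, 1],
                    [1, 0, 0, 1, 7, 9]], dtype=float)
y_train = np.array([0, 1, 2])
X_probe = np.array([[6, 9, 0, 1, 2, 0],
                    [1, 0, 7, 9, 1, 0]], dtype=float)

p_w_z0 = init_word_topic(X_train, y_train, n_classes=3)
p_w_z, _ = plsa_em(X_train, p_w_z0, n_iter=5)
_, p_z_d = plsa_em(X_probe, p_w_z, n_iter=5, update_topics=False)

# The dominant entry of P(z|d) is read off directly as the predicted label.
print(p_z_d.argmax(axis=1))
```

Because topic k starts at the center of category k, the index of the dominant topic can be interpreted as a class label without training a separate classifier, which is the property the abstract describes.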
Notes
\({\mathcal {R}}\equiv \operatorname{conv}\left( P(\cdot |z_1), P(\cdot |z_2), P(\cdot |z_3)\right) \).
Code kindly provided at: http://www4.comp.polyu.edu.hk/~cslzhang/papers.htm.
Code kindly provided at: http://www.cad.zju.edu.cn/home/dengcai/Data/Metric.html.
Code kindly provided at: http://bmc.uestc.edu.cn/~fshen/.
Code kindly provided at: http://mx.nthu.edu.tw/~tsunghan/Source%20codes.html.
Following the original setting, whitening PCA (WPCA) is applied to reduce the dimension of PCANet features on FERET and LFW databases.
Since the image size of FERET used in this paper is \(80 \times 80\) (not \(150 \times 90\)), we fine-tune the model parameters of PCANet by varying \(k_1\) and \(k_2\) from 3 to 13 in steps of 2 and the histogram block size from 6 to 20 in steps of 2, keeping \(L_1=L_2=8\). The optimal parameters are then selected.
Code kindly provided at: http://www.ifp.illinois.edu/~jyang29/ScSPM.htm.
References
Zhao W, Chellappa R, Phillips P, Rosenfeld A (2003) Face recognition: a literature survey. ACM Comput Surv 35(4):399–458
Turk M, Pentland A (1991) Eigenfaces for recognition. J Cogn Neurosci 3(1):71–86
Belhumeur P, Hespanha J, Kriegman D (1997) Eigenfaces versus Fisherfaces: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell 19(7):711–720
He X, Yan S, Hu Y, Niyogi P, Zhang H (2005) Face recognition using Laplacianfaces. IEEE Trans Pattern Anal Mach Intell 27(3):328–340
Ahonen T, Hadid A, Pietikäinen M (2006) Face description with local binary patterns: application to face recognition. IEEE Trans Pattern Anal Mach Intell 28(12):2037–2041
Liu C, Wechsler H (2002) Gabor feature based classification using the enhanced Fisher linear discriminant model for face recognition. IEEE Trans Image Process 11(4):467–476
Li L, Ge H, Tong Y, Zhang Y (2017) Face recognition using Gabor-based feature extraction and feature space transformation fusion method for single image per person problem. Neural Process Lett. https://doi.org/10.1007/s11063-017-9693-4
Lowe D (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110
Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 886–893
Wang S, Chen H, Yan W, Chen Y, Fu X (2014) Face recognition and micro-expression recognition based on discriminant tensor subspace analysis plus extreme learning machine. Neural Process Lett 39:25–43
Ding C, Xu C, Tao D (2015) Multi-task pose-invariant face recognition. IEEE Trans Image Process 24(3):980–993
Ding C, Tao D (2016) A comprehensive survey on pose-invariant face recognition. ACM Trans Intell Syst Technol 7(3):37
Ding C, Tao D (2017) Pose-invariant face recognition with homography-based normalization. Pattern Recogn 66:144–152
Wright J, Yang A, Ganesh A, Sastry S, Ma Y (2009) Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell 31(2):210–227
Zhang L, Yang M, Feng X (2011) Sparse representation or collaborative representation: Which helps face recognition? In: Proceedings of IEEE international conference on computer vision, pp 471–478
Jin T, Liu Z, Yu Z, Min X, Li L (2017) Locality preserving collaborative representation for face recognition. Neural Process Lett 45:967–979
Yang M, Zhang L, Yang J, Zhang D (2011) Robust sparse coding for face recognition. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 625–632
Yang M, Van Gool L, Zhang L (2013) Sparse variation dictionary learning for face recognition with a single training sample per person. In: Proceedings of IEEE international conference on computer vision, pp 689–696
Hofmann T (2001) Unsupervised learning by probabilistic latent semantic analysis. Mach Learn 42:177–196
Li F, Perona P (2005) A Bayesian hierarchical model for learning natural scene categories. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 524–531
Bosch A, Zisserman A, Munoz X (2006) Scene classification via pLSA. In: Proceedings of European conference on computer vision, pp 517–530
Cao L, Li F (2007) Spatially coherent latent topic model for concurrent segmentation and classification of objects and scenes. In: Proceedings of IEEE international conference on computer vision, pp 1–8
Sivic J, Russell B, Efros A, Zisserman A, Freeman W (2005) Discovering objects and their location in images. In: Proceedings of IEEE international conference on computer vision, pp 370–377
Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B 39(1):1–38
Lovato P, Bicego M, Murino V, Perina A (2015) Robust initialization for learning latent Dirichlet allocation. In: International workshop on similarity-based pattern recognition, pp 117–132
Wang Y, Mori G (2009) Human action recognition by semilatent topic models. IEEE Trans Pattern Anal Mach Intell 31(10):1762–1774
Lu Z, Peng Y, Ip H (2010) Image categorization via robust pLSA. Pattern Recogn Lett 31:36–43
Cui Z, Li W, Xu D, Shan S, Chen X (2013) Fusing robust face region descriptors via multiple metric learning for face recognition in the wild. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 3554–3561
Cui Z, Shan S, Wang R, Zhang L, Chen X (2015) Sparsely encoded local descriptor for face verification. Neurocomputing 147:403–411
Lu J, Liong V, Wang G, Moulin P (2015) Joint feature learning for face recognition. IEEE Trans Inf Forensics Secur 10:1371–1383
Lazebnik S, Schmid C, Ponce J (2006) Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 2169–2178
Li Z, Imai J, Kaneko M (2010) Robust face recognition using block-based bag of words. In: Proceedings of IEEE international conference on pattern recognition, pp 1285–1288
Chan T, Jia K, Gao S, Lu J, Zeng Z, Ma Y (2015) PCANet: A simple deep learning baseline for image classification? IEEE Trans Image Process 24(12):5017–5032
Wang Y, Xu C, You S, Xu C, Tao D (2017) DCT regularized extreme visual recovery. IEEE Trans Image Process 26(7):3360–3371
Li J, Xu C, Yang W, Sun C (2017) SPA: spatially pooled attributes for image retrieval. Neurocomputing 257:47–58
Sydorov V, Sakurada M, Lampert C (2014) Deep Fisher kernels: end to end learning of the Fisher kernel GMM parameters. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 1402–1409
Cao Z, Yin Q, Tang X, Sun J (2010) Face recognition with learning-based descriptor. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 2707–2714
Shen F, Yang Y, Zhou X, Liu X, Shao J (2016) Face identification with second-order pooling in single-layer networks. Neurocomputing 187:11–18
Shen F, Shen C, Zhou X, Yang Y, Shen H (2016) Face image classification by pooling raw features. Pattern Recogn 54:94–103
Zhu P, Zhang L, Hu Q, Shiu S (2012) Multi-scale patch based collaborative representation for face recognition with margin distribution optimization. In: Proceedings of European conference on computer vision, pp 822–835
Zhang Y, Tan X (2009) Face recognition via spatial-pLSA. In: Proceedings of Chinese conference on pattern recognition, pp 518–522
Jurie F, Triggs B (2005) Creating efficient codebooks for visual recognition. In: Proceedings of IEEE international conference on computer vision, pp 604–610
Yang J, Yu K, Gong Y, Huang T (2009) Linear spatial pyramid matching using sparse coding for image classification. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1794–1801
Wang J, Yang J, Yu K, Lv F, Huang T, Gong Y (2010) Locality-constrained linear coding for image classification. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 3360–3367
Vedaldi A, Fulkerson B (2010) VLFeat: an open and portable library of computer vision algorithms. In: International conference on multimedia. ACM, pp 1469–1472
Ding C, Choi J, Tao D, Davis L (2016) Multi-directional multi-level dual-cross patterns for robust face recognition. IEEE Trans Pattern Anal Mach Intell 38(3):518–531
Vu N, Caplier A (2011) Enhanced patterns of oriented edge magnitudes for face recognition and image matching. IEEE Trans Image Process 21(3):1352–1365
Kannala J, Rahtu E (2012) BSIF: binarized statistical image features. In: Proceedings of international conference on pattern recognition, pp 1363–1366
Aharon M, Elad M, Bruckstein A (2006) K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans Signal Process 54(11):4311–4322
Huang Y, Wu Z, Wang L, Tan T (2014) Feature coding in image classification: a comprehensive study. IEEE Trans Pattern Anal Mach Intell 36(3):493–506
Yu K, Zhang T, Gong Y (2009) Nonlinear learning using local coordinate coding. In: Proceedings of the advances in neural information processing systems, pp 2223–2231
Martínez A, Benavente R (1998) The AR face database. Technical report
Georghiades A, Belhumeur P, Kriegman D (2001) From few to many: illumination cone models for face recognition under variable lighting and pose. IEEE Trans Pattern Anal Mach Intell 23(6):643–660
Phillips P, Wechsler H, Huang J, Rauss P (1998) The FERET database and evaluation procedure for face-recognition algorithms. Image Vis Comput 16(5):295–306
Huang GB, Ramesh M, Berg T, Learned-Miller E (2007) Labeled faces in the wild: a database for studying face recognition in unconstrained environments. Technical report 07-49, University of Massachusetts, Amherst
Wolf L, Hassner T, Taigman Y (2009) Similarity scores based on background samples. In: Proceedings of Asian conference on computer vision, pp 88–97
Xu B, Bu J, Chen C, Wang C, Cai D, He X (2015) EMR: a scalable graph-based ranking model for content-based image retrieval. IEEE Trans Knowl Data Eng 27(1):102–114
Zhu P, Yang M, Zhang L, Lee IY (2014) Local generic representation for face recognition with single sample per person. In: Proceedings of Asian conference on computer vision, pp 34–50
Chen S, Liu J, Zhou Z (2004) Making FLDA applicable to face recognition with one sample per person. Pattern Recogn 37(7):1553–1555
Lu J, Tan Y, Wang G (2013) Discriminative multimanifold analysis for face recognition from a single training sample per person. IEEE Trans Pattern Anal Mach Intell 35(1):39–51
Yang L (2007) The connection between manifold learning and distance metric learning. Technical report
Sun Y, Wang X, Tang X (2014) Deep learning face representation from predicting 10,000 classes. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 1891–1898
Taigman Y, Yang M, Ranzato M, Wolf L (2014) DeepFace: closing the gap to human-level performance in face verification. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 1701–1708
Ding C, Tao D (2015) Robust face recognition via multimodal deep face representation. IEEE Trans Multimed 17(11):2049–2058
Ding C, Tao D (2017) Trunk-branch ensemble convolutional neural networks for video-based face recognition. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2017.2700390
Zhang P, You X, Ou W, Chen C, Cheung Y (2016) Sparse discriminative multimanifold embedding for one-sample face identification. Pattern Recogn 52:249–259
Fan R, Chang K, Hsieh C, Wang X, Lin C (2008) LIBLINEAR: a library for large linear classification. J Mach Learn Res 9:1871–1874
Acknowledgements
The authors would like to thank the anonymous reviewers, whose valuable comments and suggestions greatly improved this paper. The work described in this paper was partially supported by the National Natural Science Foundation of China (Grant Nos. 61772093, 61402062, 61602068), the Program for Changjiang Scholars and Innovative Research Team in University (Grant No. IRT1196), and the Chongqing Research Program of Basic Science & Frontier Technology (Grant No. cstc2015jcyjA40037).
Cite this article
Zhou, D., Yang, D., Zhang, X. et al. Discriminative Probabilistic Latent Semantic Analysis with Application to Single Sample Face Recognition. Neural Process Lett 49, 1273–1298 (2019). https://doi.org/10.1007/s11063-018-9852-2
DOI: https://doi.org/10.1007/s11063-018-9852-2