Abstract
With the abundance of video data, interest in more effective methods for recognizing faces in surveillance videos has grown. However, most algorithms proposed in this field assume that each image set lies in a single linear subspace or in a mixture of linear subspaces. As a result, 3-dimensional shape information, which gives rise to nonlinear transformations of face images, is ignored. This paper proposes a robust method for video face recognition across pose variation (RVPose) based on sparse representation. The key idea is to perform alignment and recognition simultaneously, both based on sparse representation. Moreover, because multi-pose faces of the same subject share the same texture and 3-dimensional shape, RVPose aligns all faces in a pose-varying sequence jointly, which reduces to a 3-dimensional shape-constrained video alignment problem. Finally, the aligned video sequence is recognized based on sparse representation. Experiments conducted on public video datasets demonstrate the effectiveness of the proposed algorithm.
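The final recognition step can be illustrated with a minimal sketch of generic sparse-representation-based classification applied to a set of already aligned frames. This is not the RVPose implementation: the 3-dimensional shape-constrained alignment is omitted, the l1 problem is solved with plain iterative soft-thresholding (ISTA) as a stand-in solver, and summing class-wise residuals over frames is an assumed fusion rule. The names `ista_lasso` and `src_classify`, the parameter `lam`, and the synthetic gallery are all hypothetical.

```python
import numpy as np

def ista_lasso(D, y, lam=0.01, n_iter=500):
    """Minimise 0.5*||y - D x||^2 + lam*||x||_1 by iterative soft-thresholding."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - D.T @ (D @ x - y) / L      # gradient step on the quadratic term
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

def src_classify(D, labels, frames, lam=0.01):
    """Label a set of aligned frames by the smallest summed class-wise residual.

    D      : (d, n) gallery dictionary with unit-norm columns
    labels : (n,) subject label of each gallery column
    frames : (d, m) aligned probe frames, one per column
    """
    classes = np.unique(labels)
    residuals = np.zeros(len(classes))
    for y in frames.T:
        y = y / np.linalg.norm(y)
        x = ista_lasso(D, y, lam)
        for k, c in enumerate(classes):
            mask = labels == c
            # Reconstruct the frame using only the coefficients of class c.
            residuals[k] += np.linalg.norm(y - D[:, mask] @ x[mask])
    return classes[np.argmin(residuals)]

# Synthetic stand-in data (assumption): 3 subjects, 5 gallery images each.
rng = np.random.default_rng(0)
prototypes = rng.standard_normal((60, 3))
labels = np.repeat(np.arange(3), 5)
D = prototypes[:, labels] + 0.1 * rng.standard_normal((60, 15))
D /= np.linalg.norm(D, axis=0)

# Four probe frames of subject 1 with small perturbations.
probe = prototypes[:, [1]] + 0.1 * rng.standard_normal((60, 4))
predicted = src_classify(D, labels, probe)
print(predicted)
```

Summing residuals over the whole sequence lets every frame vote with its reconstruction error, so a single badly reconstructed frame has limited influence on the final label.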
Acknowledgements
Funding was provided by National Natural Science Foundation of China (Grant No. 61305009).
Cite this article
Su, Y. Robust Video Face Recognition Under Pose Variation. Neural Process Lett 47, 277–291 (2018). https://doi.org/10.1007/s11063-017-9649-8
DOI: https://doi.org/10.1007/s11063-017-9649-8