Abstract
Accurate head pose estimation is important for many computer vision applications, such as face recognition, driver attention detection and human-computer interaction. Most appearance-based head pose estimation methods extract low-dimensional face appearance features in a statistical subspace that represents the underlying geometric structure of the pose space. How to represent the face effectively in such an appearance-based subspace, however, remains an open problem. To address it, this paper proposes a head pose estimation approach that uses the Lie Algebrized Gaussians (LAG) feature to model pose characteristics. LAG is built on Gaussian Mixture Models (GMM); it not only models the distribution of local appearance features but also captures the Lie group manifold structure of the feature space. Moreover, to preserve multi-resolution structural information, LAG is computed on multiple subregions of the image. These properties enable LAG to model the structure of the subspace face effectively, which leads to strong discriminative ability for head pose estimation. After representing the subspace face with LAG, we treat head pose estimation as a classification problem. A Support Vector Machine (SVM) classifier with within-class covariance normalization (WCCN) is employed for robust performance, since WCCN reduces the within-class variability of samples that share the same pose. Extensive experimental analysis and comparison with both traditional and state-of-the-art algorithms on two challenging benchmarks demonstrate the effectiveness of our approach.
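The WCCN step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional toy vectors stand in for LAG features, the equal class weighting and the `ridge` parameter are assumptions, and all function names are hypothetical. The sketch estimates the within-class covariance W, factors it as W = L Lᵀ, and maps each feature x to z = L⁻¹x, so that a linear kernel on z equals xᵀW⁻¹x′ and a linear SVM could then be trained on the transformed features.

```python
import math

def within_class_covariance(X, y, ridge=0.0):
    """Average of the per-class covariances of the rows of X.

    Classes are weighted equally (an assumption); `ridge` is added to
    the diagonal to keep the matrix positive definite if needed.
    """
    d = len(X[0])
    classes = sorted(set(y))
    W = [[0.0] * d for _ in range(d)]
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        mean = [sum(col) / len(rows) for col in zip(*rows)]
        for x in rows:
            diff = [xi - mi for xi, mi in zip(x, mean)]
            for i in range(d):
                for j in range(d):
                    W[i][j] += diff[i] * diff[j] / (len(rows) * len(classes))
    for i in range(d):
        W[i][i] += ridge
    return W

def cholesky(W):
    """Lower-triangular L with L * L^T == W (W must be positive definite)."""
    d = len(W)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = W[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def wccn_transform(L, x):
    """Solve L z = x by forward substitution, i.e. z = L^{-1} x."""
    z = []
    for i in range(len(x)):
        s = x[i] - sum(L[i][k] * z[k] for k in range(i))
        z.append(s / L[i][i])
    return z

# Toy data: two pose classes, three 2-D feature vectors each (hypothetical).
X = [[0.0, 0.0], [2.0, 1.0], [1.0, 3.0],
     [5.0, 5.0], [7.0, 6.0], [6.0, 8.0]]
y = [0, 0, 0, 1, 1, 1]

L = cholesky(within_class_covariance(X, y))
Z = [wccn_transform(L, x) for x in X]  # normalized features for a linear SVM
```

After the transform, the within-class covariance of `Z` is the identity matrix, which is the sense in which WCCN suppresses directions of high within-pose variability. In practice a numerical library would replace the hand-written Cholesky factorization and forward substitution.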
Acknowledgements
We thank the editors and the anonymous referees for their valuable comments. This work was supported by the National Natural Science Foundation of China under grant numbers 61073094 and U1233119. The authors would also like to thank Xinwei Jiang for his help.
Cite this article
Hu, C., Gong, L., Wang, T. et al. An effective head pose estimation approach using Lie Algebrized Gaussians based face representation. Multimed Tools Appl 73, 1863–1884 (2014). https://doi.org/10.1007/s11042-013-1676-5
DOI: https://doi.org/10.1007/s11042-013-1676-5