ABSTRACT
Face pose estimation plays an important role in a broad range of applications, such as vision-based surveillance, face authentication, and intelligent human-computer interaction. However, it remains a challenging problem, especially in complicated real-world environments. In this paper, we propose a novel face pose estimation approach that integrates two multi-scale representations. The first is a multi-scale VGG-Face representation: using the VGG-Face CNN as the backbone, the outputs of three intermediate-scale layers are extracted and refined through additional transfer learning. The second is a multi-scale Curvelet representation. The two multi-scale representations are integrated, and several dense layers are then added to form the complete ensemble system, which predicts the face pose. Experimental results show that the proposed approach achieves mean absolute errors (MAE) of 0.33° and 0.23° for the yaw and pitch angles on the CAS-PEAL pose database, and 3.88° and 1.98° for the yaw and pitch angles on the Pointing'04 database.
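The fusion step and the MAE metric described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`fuse_representations`, `dense`, `mean_absolute_error`), the use of plain concatenation for integration, and the ReLU activation are assumptions made here for illustration, since the abstract does not specify how the two representations are combined or how the dense layers are configured.

```python
from typing import List, Sequence


def fuse_representations(
    cnn_scales: Sequence[Sequence[float]],
    curvelet_scales: Sequence[Sequence[float]],
) -> List[float]:
    """Concatenate per-scale feature vectors from both branches.

    Assumption: integration is modeled as simple concatenation.
    """
    fused: List[float] = []
    for vec in list(cnn_scales) + list(curvelet_scales):
        fused.extend(vec)
    return fused


def dense(
    x: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
    relu: bool = True,
) -> List[float]:
    """One fully connected layer: y = W x + b, optionally followed by ReLU.

    Assumption: ReLU is a placeholder activation, not taken from the paper.
    """
    out = [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]
    return [max(0.0, v) for v in out] if relu else out


def mean_absolute_error(
    predicted: Sequence[float], actual: Sequence[float]
) -> float:
    """Mean absolute error between predicted and true angles, in degrees."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


if __name__ == "__main__":
    # Toy feature vectors standing in for three CNN scales and two Curvelet scales.
    fused = fuse_representations([[0.2, 0.4], [0.1], [0.3]], [[0.5], [0.6]])
    # A single linear output layer producing (yaw, pitch); weights are made up.
    yaw_pitch = dense(fused, [[1.0] * 6, [-1.0] * 6], [0.0, 0.0], relu=False)
    print(yaw_pitch)
    # MAE is computed separately for each angle over a set of test images.
    print(mean_absolute_error([10.0, -5.0], [12.0, -5.0]))
```

MAE is computed per angle, so the yaw and pitch errors quoted in the abstract would each come from a separate call over the test set.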