Abstract
Image-based face pose estimation infers the direction a face is pointing from 2D images. It provides important information for many face recognition applications, but it is a difficult task because imaging conditions and facial appearance vary widely. Deep learning methods used in this field have the disadvantage of ignoring the natural structure of human faces. To address this problem, this paper proposes a regression framework for face pose estimation based on deep learning and a multi-modal feature loss (\(M^2FL\)). Unlike current loss functions, which use only a single type of feature, \(M^2FL\) improves descriptive power by combining multiple image features, which is achieved by applying hypergraph-based manifold regularization. In this way, the loss of face pose estimation is reduced. Experimental results on commonly used benchmark datasets demonstrate the performance of \(M^2FL\).
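To make the idea of a regression loss with hypergraph-based manifold regularization over several feature types more concrete, the following is a minimal NumPy sketch. It is not the paper's formulation, which the abstract does not give. It assumes the widely used normalized hypergraph Laplacian \(L = I - D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2}\), one k-nearest-neighbour hyperedge per sample, fixed per-modality weights, and a mean-squared-error regression term. All function names, the weights `alphas`, and the parameters `lam` and `k` are hypothetical.

```python
import numpy as np


def knn_incidence(features, k=3):
    """Incidence matrix H (n x n): hyperedge e holds sample e and its k nearest neighbours."""
    n = features.shape[0]
    # Pairwise squared Euclidean distances between all samples.
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
    H = np.zeros((n, n))
    for e in range(n):
        members = np.argsort(d2[e])[: k + 1]
        H[members, e] = 1.0
        H[e, e] = 1.0  # make sure the centre sample is always a member
    return H


def hypergraph_laplacian(H, w=None):
    """Normalized hypergraph Laplacian I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}."""
    n, m = H.shape
    w = np.ones(m) if w is None else np.asarray(w, dtype=float)
    dv = H @ w          # vertex degrees (weighted)
    de = H.sum(axis=0)  # hyperedge degrees
    dv_inv_sqrt = np.diag(1.0 / np.sqrt(dv))
    theta = dv_inv_sqrt @ H @ np.diag(w / de) @ H.T @ dv_inv_sqrt
    return np.eye(n) - theta


def multimodal_loss(pred, target, feature_sets, alphas, lam=0.1, k=3):
    """MSE regression loss plus a weighted sum of per-modality manifold penalties.

    pred, target: (n, d) arrays of predicted and ground-truth pose values.
    feature_sets: list of (n, d_i) arrays, one per feature type (assumed inputs).
    alphas: per-modality weights, fixed here rather than learned (an assumption).
    """
    mse = np.mean((pred - target) ** 2)
    L = sum(a * hypergraph_laplacian(knn_incidence(f, k))
            for a, f in zip(alphas, feature_sets))
    # tr(P^T L P) is small when samples sharing hyperedges get similar predictions.
    reg = np.trace(pred.T @ L @ pred) / pred.shape[0]
    return mse + lam * reg
```

In a deep learning setting the same penalty would be computed per mini-batch on the network's outputs with an automatic-differentiation library; NumPy is used here only to keep the sketch self-contained.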
This work was supported in part by the National Natural Science Foundation of China (61871464 and 61836002), the Fujian Provincial Natural Science Foundation of China (2018J01573), the Foundation of Fujian Educational Committee (JAT160357), the Distinguished Young Scientific Research Talents Plan in Universities of Fujian Province, and the Program for New Century Excellent Talents in University of Fujian Province.
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wu, Y., Hong, C., Chen, L., Zeng, Z. (2020). Regression-Based Face Pose Estimation with Deep Multi-modal Feature Loss. In: Zeng, J., Jing, W., Song, X., Lu, Z. (eds) Data Science. ICPCSEE 2020. Communications in Computer and Information Science, vol 1257. Springer, Singapore. https://doi.org/10.1007/978-981-15-7981-3_39
DOI: https://doi.org/10.1007/978-981-15-7981-3_39
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-7980-6
Online ISBN: 978-981-15-7981-3
eBook Packages: Computer Science, Computer Science (R0)