Regression-Based Face Pose Estimation with Deep Multi-modal Feature Loss

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1257)

Abstract

Image-based face pose estimation aims to estimate the direction a face is pointing from 2D images, and it provides important information for many face recognition applications. The task is difficult, however, because imaging conditions and facial appearances vary widely. Deep learning methods used in this field have the disadvantage of ignoring the natural structure of human faces. To address this problem, this paper proposes a regression framework for face pose estimation based on deep learning and a multi-modal feature loss (\(M^2FL\)). Unlike current loss functions, which use only a single type of feature, \(M^2FL\) improves descriptive power by combining multiple image features, using hypergraph-based manifold regularization to do so. In this way, the loss of face pose estimation is reduced. Experimental results on commonly used benchmark datasets demonstrate the performance of \(M^2FL\).
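The abstract does not give the exact form of \(M^2FL\), so the following is only a minimal sketch of the general idea it names: a regression loss plus a hypergraph-based manifold regularizer built from several feature types. It assumes the widely used normalized hypergraph Laplacian \(L = I - D_v^{-1/2} H W D_e^{-1} H^\top D_v^{-1/2}\), k-nearest-neighbour hyperedges, fixed per-modality weights, and poses represented as three angles. The function names (`knn_incidence`, `hypergraph_laplacian`, `multimodal_loss`) and the parameters `alphas`, `lam`, and `k` are hypothetical and do not come from the paper.

```python
import numpy as np

def knn_incidence(features, k=3):
    """Incidence matrix H (n_samples x n_hyperedges).
    Assumption: each sample spawns one hyperedge holding itself and its k nearest neighbours."""
    n = features.shape[0]
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
    H = np.zeros((n, n))
    for e in range(n):
        members = np.argsort(dists[e])[: k + 1]
        H[members, e] = 1.0
        H[e, e] = 1.0  # guarantee every vertex has non-zero degree
    return H

def hypergraph_laplacian(H, w=None):
    """Normalized hypergraph Laplacian I - Dv^-1/2 H W De^-1 H^T Dv^-1/2."""
    n_v, n_e = H.shape
    w = np.ones(n_e) if w is None else w
    d_v = H @ w            # vertex degrees (weighted)
    d_e = H.sum(axis=0)    # hyperedge degrees (number of members)
    dv_inv_sqrt = np.diag(1.0 / np.sqrt(d_v))
    theta = dv_inv_sqrt @ H @ np.diag(w / d_e) @ H.T @ dv_inv_sqrt
    return np.eye(n_v) - theta

def multimodal_loss(pred, target, modal_features, alphas, lam=0.1, k=3):
    """Mean squared regression error plus a weighted sum of per-modality
    hypergraph regularizers tr(pred^T L pred). Hypothetical combination rule."""
    mse = np.mean((pred - target) ** 2)
    L = sum(a * hypergraph_laplacian(knn_incidence(f, k))
            for a, f in zip(alphas, modal_features))
    reg = np.trace(pred.T @ L @ pred)
    return mse + lam * reg

# Toy usage: 8 samples, two feature modalities, poses as three angles (assumed).
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(8, 5))
feats_b = rng.normal(size=(8, 3))
pred = rng.normal(size=(8, 3))
target = rng.normal(size=(8, 3))
loss = multimodal_loss(pred, target, [feats_a, feats_b], alphas=[0.5, 0.5])
```

The regularizer is small when samples that share hyperedges in any feature modality receive similar predicted poses, which is one way to make a loss sensitive to structure that a single feature type would miss.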

This work was supported in part by the National Natural Science Foundation of China (61871464 and 61836002), the Fujian Provincial Natural Science Foundation of China (2018J01573), the Foundation of Fujian Educational Committee (JAT160357), the Distinguished Young Scientific Research Talents Plan in Universities of Fujian Province, and the Program for New Century Excellent Talents in University of Fujian Province.

Author information

Corresponding author

Correspondence to Chaoqun Hong.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wu, Y., Hong, C., Chen, L., Zeng, Z. (2020). Regression-Based Face Pose Estimation with Deep Multi-modal Feature Loss. In: Zeng, J., Jing, W., Song, X., Lu, Z. (eds) Data Science. ICPCSEE 2020. Communications in Computer and Information Science, vol 1257. Springer, Singapore. https://doi.org/10.1007/978-981-15-7981-3_39

  • DOI: https://doi.org/10.1007/978-981-15-7981-3_39

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-7980-6

  • Online ISBN: 978-981-15-7981-3

  • eBook Packages: Computer Science (R0)
