Robust 3D face modeling and tracking from RGB-D images

  • Regular Paper
  • Multimedia Systems

Abstract

We address the problem of 3D face modeling and tracking from RGB-D images. Existing methods usually fit a deformable model to an RGB-D image using the iterative closest point (ICP) algorithm, but noise and occlusion in the depth image make them insufficiently robust. To address this, we propose a robust method for 3D face modeling and tracking. Given an input RGB-D face image, our method first estimates the initial head pose using random forests. A generic bilinear face model is then fitted to the RGB-D image using ICP. To improve the accuracy and robustness of the fit, an optimal weight for each face vertex is integrated into the fitting procedure, and the distances between facial landmarks are used to better estimate facial expressions. Finally, the head pose and the identity and expression parameters of the bilinear face model are jointly optimized. Experiments show that our method generates accurate 3D face models from a single RGB-D image or an image sequence, and that it tracks the face robustly even under large head rotations and varied facial expressions.
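
As a rough illustration of how per-vertex weights can enter an iterative closest point (ICP) fit, the sketch below aligns a set of model vertices to a depth point cloud with a weighted rigid step. It is not the authors' implementation: the function names, the residual-based weight `1 / (1 + (d / sigma)^2)`, and the `sigma` parameter are assumptions made for this example, and the abstract does not specify how the paper's optimal weights are computed. Only the rigid pose is estimated here; the identity and expression parameters of the bilinear model are left out.

```python
import numpy as np


def weighted_rigid_fit(src, dst, w):
    """Weighted least-squares rigid transform (R, t) mapping src onto dst.

    src, dst: (N, 3) arrays of corresponding points; w: (N,) positive weights.
    Uses the SVD-based (Kabsch) solution with a reflection guard.
    """
    w = np.asarray(w, dtype=float)
    w = w / w.sum()
    mu_s = w @ src                      # weighted centroid of the source
    mu_d = w @ dst                      # weighted centroid of the target
    # Weighted cross-covariance between the centred point sets.
    H = (src - mu_s).T @ ((dst - mu_d) * w[:, None])
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t


def nearest_neighbors(src, dst):
    """Brute-force closest point in dst for every point in src."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)
    return idx, np.sqrt(d2[np.arange(len(src)), idx])


def weighted_icp(model, scan, iters=20, sigma=0.01):
    """Rigid ICP in which each model vertex carries a weight.

    The weight used here (an assumption for this sketch) shrinks as the
    distance to the closest scan point grows, so noisy or occluded
    regions pull less on the estimated pose.
    """
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = model.copy()
    for _ in range(iters):
        idx, dist = nearest_neighbors(cur, scan)
        w = 1.0 / (1.0 + (dist / sigma) ** 2)   # hypothetical weight
        R, t = weighted_rigid_fit(cur, scan[idx], w)
        cur = cur @ R.T + t
        # Compose the incremental update with the accumulated pose.
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

In a setting like the one described, `model` would hold the vertices of the current face model and `scan` the back-projected depth pixels of the face region, with the pose from a separate initial estimate applied to `model` before calling `weighted_icp`. The brute-force neighbor search is quadratic in the number of points, so a spatial index would normally replace it for full-resolution depth maps.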

Acknowledgements

This work was supported by the capital health research and development of special (no. 2021-1G-2182), and the state key development program in 14th Five-Year (no. 2021YFF0602103).

Author information

Corresponding authors

Correspondence to Jing Huang or Shengjin Wang.

Additional information

Communicated by A. Liu.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Luo, C., Zhang, J., Bao, C. et al. Robust 3D face modeling and tracking from RGB-D images. Multimedia Systems 28, 1657–1666 (2022). https://doi.org/10.1007/s00530-022-00925-7

  • DOI: https://doi.org/10.1007/s00530-022-00925-7
