Abstract
We present a template-fitting-based method for efficient markerless full head performance capture. Our method takes high-resolution multi-view videos as input and efficiently outputs high-fidelity full head mesh sequences that share the topology of a common template model. A GPU-accelerated stereo reconstruction first computes a high-quality point cloud at each frame. The template model is then warped and fitted to the reconstructed geometry using a combination of a detected-landmark constraint, a nearest-neighbor constraint, and a volumetric regularization. In addition, we reconstruct the detailed ear structures at the initialization frame and track their motion in subsequent frames under a global rigid transformation assumption. To counter error accumulation on long sequences, the method updates mesh vertex positions in a coarse-to-fine manner using priors from both the previous and the initial frames. Together, these technical contributions significantly reduce the overall processing time for reconstructing topology-consistent full head meshes with fine details. Experiments demonstrate that the proposed method is both more efficient than and outperforms previous methods.
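As a rough illustration of the fitting step described above, the per-frame objective can be thought of as a weighted sum of the three constraints; the symbols and weights below are illustrative assumptions for exposition, not the exact formulation used in the paper:

E(\mathbf{X}) = \lambda_{\mathrm{lm}} \sum_{i \in \mathcal{L}} \lVert \mathbf{x}_i - \mathbf{p}_i \rVert^2 + \lambda_{\mathrm{nn}} \sum_{j} \lVert \mathbf{x}_j - \mathrm{NN}(\mathbf{x}_j) \rVert^2 + \lambda_{\mathrm{vol}} E_{\mathrm{vol}}(\mathbf{X})

Here \mathbf{X} denotes the template vertices, \mathbf{p}_i the targets implied by the detected landmarks, \mathrm{NN}(\mathbf{x}_j) the nearest point on the reconstructed point cloud, and E_{\mathrm{vol}} a volumetric regularizer (e.g., defined on a tetrahedralization of the template) that keeps the warp well-behaved in regions not covered by the point cloud.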
Acknowledgment
This research was supported by the National Natural Science Foundation of China (No. 61872317).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, H., Lin, Y., Liu, X. (2022). Full Head Performance Capture Using Multi-scale Mesh Propagation. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13536. Springer, Cham. https://doi.org/10.1007/978-3-031-18913-5_5
DOI: https://doi.org/10.1007/978-3-031-18913-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18912-8
Online ISBN: 978-3-031-18913-5
eBook Packages: Computer Science (R0)