HiFace: Hybrid Task Learning for Face Reconstruction from Single Image

Xu, Wei; Fu, Zhihong; Chen, Zhixing; Deng, Qili; Fu, Mingtao; Zhang, Xijin; Gao, Yuan; Du, Daniel K.; Zheng, Min

doi:10.1007/978-3-031-25072-9_26

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13805)

Included in the following conference series:

European Conference on Computer Vision

Abstract

The 3D face reconstruction task in the WCPA challenge takes a monocular image as input and outputs 3D face geometry, a problem that has been studied for decades. Among the many published methods, PerspNet significantly outperforms the others under perspective projection. However, because UV coordinates are unevenly distributed, the UV mapping process inevitably degrades precision in dense regions of the reconstructed 3D face. We therefore design a vertex refinement module to counter this degradation, and a multi-task learning module to enhance the 3D features. Combining the two modules, we propose HiFace, a hybrid-task-learning-based 3D face reconstruction method. HiFace took 2nd place in the final official ranking of the ECCV 2022 WCPA Challenge, which demonstrates its effectiveness.
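
The precision issue described in the abstract can be illustrated with a toy experiment. The sketch below is not the authors' pipeline: it assumes only that per-vertex values are read back from a UV-space map of finite resolution by nearest-cell lookup, and all names, the map size, and the two UV layouts are hypothetical. It counts how many vertices end up sharing a map cell when the same number of vertices is packed into a small UV area versus spread over the whole UV square.

```python
import random


def uv_to_pixel(u, v, resolution):
    """Map UV coordinates in [0, 1] to integer cell indices (nearest-cell lookup)."""
    col = min(int(u * resolution), resolution - 1)
    row = min(int(v * resolution), resolution - 1)
    return row, col


def collision_rate(uvs, resolution):
    """Fraction of vertices that share a cell with at least one other vertex."""
    counts = {}
    for u, v in uvs:
        key = uv_to_pixel(u, v, resolution)
        counts[key] = counts.get(key, 0) + 1
    shared = sum(c for c in counts.values() if c > 1)
    return shared / len(uvs)


def sample_region(rng, n, u_range, v_range):
    """Draw n UV points uniformly from a rectangular UV region (toy layout)."""
    return [(rng.uniform(*u_range), rng.uniform(*v_range)) for _ in range(n)]


rng = random.Random(0)
resolution = 64  # hypothetical UV map size, chosen only for illustration

# Assumed layouts: equal vertex counts, very different UV footprints.
dense = sample_region(rng, 500, (0.45, 0.55), (0.45, 0.55))
sparse = sample_region(rng, 500, (0.0, 1.0), (0.0, 1.0))

dense_rate = collision_rate(dense, resolution)
sparse_rate = collision_rate(sparse, resolution)
print(f"dense region:  {dense_rate:.2f} of vertices share a cell")
print(f"sparse region: {sparse_rate:.2f} of vertices share a cell")
```

Vertices that share a cell read back the same stored value, so their individual positions cannot be told apart from the map alone. In this toy setting, that is the kind of loss a per-vertex refinement step would have to recover.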

Author information

Authors and Affiliations

Beijing University of Posts and Telecommunications, Beijing, China
Wei Xu
ByteDance Inc., Beijing, China
Wei Xu, Zhihong Fu, Zhixing Chen, Qili Deng, Mingtao Fu, Xijin Zhang, Yuan Gao, Daniel K. Du & Min Zheng

Corresponding author

Correspondence to Zhihong Fu.

Editor information

Editors and Affiliations

IBM Research AI and MIT-IBM Watson AI Lab, Haifa, Israel
Leonid Karlinsky
Technion – Israel Institute of Technology, Haifa, Israel
Tomer Michaeli
Kyoto University, Kyoto, Japan
Ko Nishino

Cite this paper

Xu, W. et al. (2023). HiFace: Hybrid Task Learning for Face Reconstruction from Single Image. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13805. Springer, Cham. https://doi.org/10.1007/978-3-031-25072-9_26

DOI: https://doi.org/10.1007/978-3-031-25072-9_26
Published: 18 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25071-2
Online ISBN: 978-3-031-25072-9
eBook Packages: Computer Science, Computer Science (R0)
