CED-Net: contextual encoder–decoder network for 3D face reconstruction

  • Regular Paper
  • Published in Multimedia Systems

Abstract

This paper proposes CED-Net, a contextual encoder–decoder network that regresses a UV position map for 3D face reconstruction. CED-Net introduces contextual information at both the shape level and the feature level and treats it as crucial information for reconstruction. First, the shape context relationship is imposed as a loss-function constraint, in which the intrinsic Euclidean norm and the vector angle similarity are computed for each contextual vector. Second, to add contextual information at the feature level, a local feature correlation modulator is incorporated into the center section of the network, directing it to capture the spatial relationships between facial features. Heatmaps of the learned features show that CED-Net attends to comprehensive facial regions during reconstruction. Quantitative and qualitative experiments on AFLW2000-3D and AFLW-LPFA show that the proposed method achieves superior results on both benchmarks.
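
The shape-level constraint described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how contextual vectors are built or how the two terms are weighted, so the sketch assumes that a contextual vector points from a vertex to one of its neighbours, that the norm term is a mean absolute difference, that the angle term is one minus cosine similarity, and that both terms are summed with equal weight. The names `contextual_vectors`, `shape_context_loss`, and `neighbors` are hypothetical.

```python
import numpy as np


def contextual_vectors(points, neighbors):
    """Vectors from each point to its neighbours.

    points:    (N, 3) array of 3D vertex positions.
    neighbors: (N, K) integer array of neighbour indices (assumed given).
    Returns an (N, K, 3) array.
    """
    return points[neighbors] - points[:, None, :]


def shape_context_loss(pred, gt, neighbors, eps=1e-8):
    """Hypothetical shape-context loss: norm term plus angle term.

    The equal weighting of the two terms is an assumption of this sketch.
    """
    v_pred = contextual_vectors(pred, neighbors)
    v_gt = contextual_vectors(gt, neighbors)

    # Euclidean norm of every contextual vector, shape (N, K).
    n_pred = np.linalg.norm(v_pred, axis=-1)
    n_gt = np.linalg.norm(v_gt, axis=-1)

    # Norm term: mean absolute difference between vector lengths.
    norm_term = np.mean(np.abs(n_pred - n_gt))

    # Angle term: one minus cosine similarity, averaged over all vectors.
    cos = np.sum(v_pred * v_gt, axis=-1) / (n_pred * n_gt + eps)
    angle_term = np.mean(1.0 - cos)

    return norm_term + angle_term
```

Because the loss in this sketch depends only on differences between vertices, it ignores a global translation of the predicted shape and penalizes changes in the relative layout of neighbouring points, which is the kind of relationship a shape context constraint is meant to capture.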

Author information

Corresponding author

Correspondence to Lei Zhu.

About this article

Cite this article

Zhu, L., Wang, S., Zhao, Z. et al. CED-Net: contextual encoder–decoder network for 3D face reconstruction. Multimedia Systems 28, 1713–1722 (2022). https://doi.org/10.1007/s00530-022-00938-2

  • DOI: https://doi.org/10.1007/s00530-022-00938-2
