Abstract
Increasingly attention has been paid to 3D understanding and reconstruction recently, while the inputs of most existing models are chromatic photos. 3D shape modelling from the monochromatic input, such as sketch, largely remains under-explored. One of the major challenges is the lack of paired training data, since it is costly to collect such a database with one-to-one mapping instances of two modalities, e.g., a 2D sketch and its corresponding 3D shape. In this work, we attempt to attack the problem of 3D face reconstruction using 2D sketch in an unsupervised setting. In particular, an end-to-end learning framework is proposed. There are two key modules of the network, the 2D translation network and the 3D reconstruction network. The 2D translation network is utilized to translate an input sketch face into a form of realistic chromatic 2D image. Then an unsupervised 3D reconstruction network is proposed to further transform the 2D image obtained in the previous step into a 3D face shape. In addition, because there is no existing sketch-3D face dataset available, two synthetic datasets are constructed based on BFM and CelebA, namely SynBFM and SynCelebA, to facilitate the evaluation. Extensive experiments conducted on these two synthetic datasets validate the effectiveness of our proposed approach.
The first author of this paper is a graduate student.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Bagdanov, A.D., Del Bimbo, A., Masi, I.: The florence 2D/3D hybrid face dataset. In: J-HGBU (2011)
Blanz, V., Vetter, T.: A morphable model for the synthesis of 3D faces. In: Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques (1999)
Bregler, C., Hertzmann, A., Biermann, H.: Recovering non-rigid 3D shape from image streams. In: CVPR (2000)
Chen, A., Chen, Z., Zhang, G., Mitchell, K., Yu, J.: Photo-realistic facial details synthesis from single image. In: ICCV (2019)
Chen, C., Tan, X., Wong, K.K.: Face sketch synthesis with style transfer using pyramid column feature. In: WACV (2018)
Chen, R.T., Li, X., Grosse, R., Duvenaud, D.: Isolating sources of disentanglement in variational autoencoders. arXiv preprint arXiv:1802.04942 (2018)
Deng, Y., Yang, J., Xu, S., Chen, D., Jia, Y., Tong, X.: Accurate 3D face reconstruction with weakly-supervised learning: from single image to image set. In: CVPRW (2019)
Feng, Y., Feng, H., Black, M.J., Bolkart, T.: Learning an animatable detailed 3D face model from in-the-wild images. In: TOG (2021)
Gecer, B., Ploumpis, S., Kotsia, I., Zafeiriou, S.: GANFIT: generative adversarial network fitting for high fidelity 3D face reconstruction. In: CVPR (2019)
Godard, C., Mac Aodha, O., Brostow, G.J.: Unsupervised monocular depth estimation with left-right consistency. In: CVPR (2017)
Henzler, P., Mitra, N., Ritschel, T.: Escaping plato’s cave using adversarial training: 3D shape from unstructured 2D image collections. In: ICCV (2019)
Hu, J., et al.: Information competing process for learning diversified representations. arXiv preprint arXiv:1906.01288 (2019)
Huang, X., Liu, M.Y., Belongie, S., Kautz, J.: Multimodal unsupervised image-to-image translation. In: ECCV (2018)
Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: CVPR (2017)
Karacan, L., Akata, Z., Erdem, A., Erdem, E.: Learning to generate images of outdoor scenes from attributes and semantic layouts. arXiv preprint arXiv:1612.00215 (2016)
Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. In: CVPR (2019)
Kato, H., Ushiku, Y., Harada, T.: Neural 3D mesh renderer. In: CVPR (2018)
Lin, J., Yuan, Y., Shao, T., Zhou, K.: Towards high-fidelity 3D face reconstruction from in-the-wild images using graph convolutional networks. In: CVPR (2020)
Liu, M.Y., Breuel, T., Kautz, J.: Unsupervised image-to-image translation networks. In: NIPS (2017)
Liu, Z., Luo, P., Wang, X., Tang, X.: Deep learning face attributes in the wild. In: ICCV (2015)
Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.: Occupancy networks: learning 3D reconstruction in function space. In: CVPR (2019)
Milletari, F., Navab, N., Ahmadi, S.: V-Net: fully convolutional neural networks for volumetric medical image segmentation. In: 3DV (2016)
Pan, X., Dai, B., Liu, Z., Loy, C.C., Luo, P.: Do 2D GANs know 3D shape? Unsupervised 3D shape reconstruction from 2D image GANs. arXiv preprint arXiv:2011.00844 (2020)
Park, T., Efros, A.A., Zhang, R., Zhu, J.-Y.: Contrastive learning for unpaired image-to-image translation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12354, pp. 319–345. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58545-7_19
Paysan, P., Knothe, R., Amberg, B., Romdhani, S., Vetter, T.: A 3D face model for pose and illumination invariant face recognition. In: AVSS (2009)
Sanyal, S., Bolkart, T., Feng, H., Black, M.J.: Learning to regress 3D face shape and expression from an image without 3D supervision. In: CVPR (2019)
Wang, L., Qian, C., Wang, J., Fang, Y.: Unsupervised learning of 3D model reconstruction from hand-drawn sketches. In: ACM Multimedia (2018)
Wang, X., Tang, X.: Face photo-sketch synthesis and recognition. TPAMI 31(11), 1955–1967 (2008)
Wu, S., Rupprecht, C., Vedaldi, A.: Unsupervised learning of probably symmetric deformable 3D objects from images in the wild. In: CVPR (2020)
Xiang, N., et al.: Sketch-based modeling with a differentiable renderer. Comput. Anim. Virtual Worlds 31(4–5), e1939 (2020)
Zhang, W., Wang, X., Tang, X.: Coupled information-theoretic encoding for face photo-sketch recognition. In: CVPR (2011)
Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: ICCV (2017)
Zhu, X., Liu, X., Lei, Z., Li, S.Z.: Face alignment in full pose range: a 3D total solution. TPAMI 41(1), 78–92 (2017)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, Y., Yan, Q., Zhou, W., Liu, F. (2021). Unsupervised Learning Framework for 3D Reconstruction from Face Sketch. In: Ma, H., et al. Pattern Recognition and Computer Vision. PRCV 2021. Lecture Notes in Computer Science(), vol 13020. Springer, Cham. https://doi.org/10.1007/978-3-030-88007-1_20
Download citation
DOI: https://doi.org/10.1007/978-3-030-88007-1_20
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88006-4
Online ISBN: 978-3-030-88007-1
eBook Packages: Computer ScienceComputer Science (R0)