StructureFromGAN: Single Image 3D Model Reconstruction and Photorealistic Texturing

Kniaz, Vladimir V.; Knyaz, Vladimir A.; Mizginov, Vladimir; Kozyrev, Mark; Moshkantsev, Petr

doi:10.1007/978-3-030-66096-3_40

Conference paper
First Online: 03 January 2021

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12536)

Abstract

We present a generative adversarial model for single-photo 3D reconstruction and high-resolution texturing. Our framework leverages a neural renderer and a 3D morphable model of an object. We train our generator on the semantic labelling-to-image translation task. This allows our model to learn rich priors about object appearance and to perform all-around texture and shape reconstruction from a single image. Our new generator architecture leverages the power of the StyleGAN2 model for image-to-image translation with fine texture detail at \(1024 \times 1024\) resolution. We evaluate our framework quantitatively and qualitatively on the Florence Face and Apollo Cars datasets on the tasks of car 3D reconstruction and texturing. Extensive experiments demonstrate that our framework matches or surpasses the state of the art in single-photo 3D object reconstruction and texturing using 3D morphable models. Our code is publicly available (http://www.zefirus.org/StructureFromGAN).
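
For readers unfamiliar with the term, a 3D morphable model is commonly written as a linear shape model. The form below is the standard textbook formulation, not necessarily the exact parameterization used in this paper; the symbols \(\bar{S}\), \(B_i\), \(\alpha_i\), and \(K\) are notation assumed here for illustration:

\[
S(\alpha) = \bar{S} + \sum_{i=1}^{K} \alpha_i B_i
\]

Here \(\bar{S}\) is the mean shape (stacked vertex coordinates), the \(B_i\) are basis shapes, typically principal components of a set of registered scans, and the \(\alpha_i\) are the coefficients estimated for a given input image.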

Acknowledgments

The reported study was funded by the Russian Science Foundation (RSF) according to the research project No. 19-11-11008.

Author information

Authors and Affiliations

State Research Institute of Aviation Systems (GosNIIAS), Moscow, Russia
Vladimir V. Kniaz, Vladimir A. Knyaz, Vladimir Mizginov, Mark Kozyrev & Petr Moshkantsev
Moscow Institute of Physics and Technology (MIPT), Moscow, Russia
Vladimir V. Kniaz & Vladimir A. Knyaz

Corresponding author

Correspondence to Vladimir V. Kniaz.

Editor information

Editors and Affiliations

University of Clermont Auvergne, Clermont Ferrand, France
Adrien Bartoli
Università degli Studi di Udine, Udine, Italy
Andrea Fusiello

About this paper

Cite this paper

Kniaz, V.V., Knyaz, V.A., Mizginov, V., Kozyrev, M., Moshkantsev, P. (2020). StructureFromGAN: Single Image 3D Model Reconstruction and Photorealistic Texturing. In: Bartoli, A., Fusiello, A. (eds) Computer Vision – ECCV 2020 Workshops. ECCV 2020. Lecture Notes in Computer Science, vol 12536. Springer, Cham. https://doi.org/10.1007/978-3-030-66096-3_40

DOI: https://doi.org/10.1007/978-3-030-66096-3_40
Published: 03 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66095-6
Online ISBN: 978-3-030-66096-3
eBook Packages: Computer Science, Computer Science (R0)