
Three-view generation based on a single front view image for car

  • Original article
  • Published in: The Visual Computer

Abstract

Multiple views of an object can be used for 3D reconstruction. The method proposed in this paper generates the left view and the top view of a target car through deep learning. The only input is a front-view image of a 3D car; no depth information is required. First, rough orthographic views of the car are obtained from an information-constraint network, which is constructed by considering the structural relations between one view and the other two. The rough orthographic views are then transformed into large-pixel-block views through nearest-neighbor interpolation, and the large pixel blocks are migrated to improve the quality of the rough views. Finally, a generative adversarial network with a perceptual loss is used to enhance the large-pixel-block views. In addition, the three views generated by the network can be used to synthesize a 3D point-cloud shell.
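The coarse-to-fine step above upscales a rough view into a large-pixel-block image via nearest-neighbor interpolation. A minimal sketch of that operation (the paper's actual resolutions and network details are not given here, so the sizes below are illustrative):

```python
import numpy as np

def nearest_upscale(view: np.ndarray, factor: int) -> np.ndarray:
    """Upscale a 2D view by repeating each pixel into a factor x factor
    block, i.e. nearest-neighbor interpolation."""
    return np.repeat(np.repeat(view, factor, axis=0), factor, axis=1)

# Example: a 2x2 rough view becomes a 4x4 large-pixel-block view,
# where each original pixel now covers a 2x2 block.
rough = np.array([[0, 1],
                  [2, 3]])
blocky = nearest_upscale(rough, 2)
print(blocky)
```

Each source pixel becomes a uniform block, which is what the subsequent GAN stage then refines into a sharp view.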





Acknowledgements

This work was partially supported by the Natural Science Foundation of China (Nos. 61762007 and 61861004) and the Natural Science Foundation of Guangxi Province, China (Nos. 2017GXNSFAA198269 and 2017GXNSFAA198267).

Author information


Corresponding author

Correspondence to Mengxiao Yin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Qin, Z., Yin, M., Lin, Z. et al. Three-view generation based on a single front view image for car. Vis Comput 37, 2195–2205 (2021). https://doi.org/10.1007/s00371-020-01979-2

