Abstract
Reconstructing a 3D object from a single image is an important task in computer vision. In recent years, deep learning has achieved remarkable results on this task. Traditional methods for reconstructing a 3D object from a single image require prior knowledge and assumptions; the objects they can reconstruct are limited to certain categories, and they struggle to produce good reconstructions from real images. Although deep learning can address these problems well thanks to its powerful learning ability, it also faces many problems of its own. In this paper, we first discuss the challenges of applying deep learning to reconstruct 3D objects from a single image. Second, we comprehensively review the encoders, decoders and training details used in single-image 3D reconstruction. Then, we introduce the common datasets and evaluation metrics used for single-image 3D object reconstruction in recent years. To analyze the advantages and disadvantages of different 3D reconstruction methods, we compare them through a series of experiments. In addition, we briefly present some related application examples involving single-image 3D reconstruction. Finally, we summarize the paper and discuss future directions.
Acknowledgements
The authors are grateful to the Development Research Center of Guangxi Relatively Sparse-populated Minorities (ID: GXRKJSZ201901) and to the Natural Science Foundation of Guangxi Province (No. 2018GXNSFAA281164). This research was financially supported by the project of outstanding thousand young teachers’ training in higher education institutions of Guangxi and by the Guangxi Colleges and Universities Key Laboratory Breeding Base of System Control and Information Processing.
Additional information
Kui Fu and Jiansheng Peng contributed equally to this work.
Cite this article
Fu, K., Peng, J., He, Q. et al. Single image 3D object reconstruction based on deep learning: A review. Multimed Tools Appl 80, 463–498 (2021). https://doi.org/10.1007/s11042-020-09722-8
DOI: https://doi.org/10.1007/s11042-020-09722-8