An improved recurrent neural networks for 3d object reconstruction

Published in Applied Intelligence

Abstract

3D-R2N2 and other advanced neural networks for 3D reconstruction have achieved impressive results, but most of them still suffer from training difficulties and loss of detail, owing to weak feature extraction and poorly suited loss functions. This paper aims to overcome these shortcomings with a new model built on 3D-R2N2. The new model adopts a densely connected structure as its encoder and uses the Chamfer Distance as its loss function, which strengthens the network's ability to learn from complex data and focuses the whole network on reconstructing fine structures. In addition, we improve the decoder by building two parallel predictor branches that make better use of the extracted features and boost reconstruction performance. Extensive experiments show that the proposed model, called 3D-R2N2-V2, predicts slightly more slowly than 3D-R2N2 but trains 20% to 30% faster and achieves 15% and 10% higher voxel IoU on single- and multi-view reconstruction tasks, respectively. Its reconstructions are also competitive with those of other recent state-of-the-art methods such as OGN and DRC.
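For orientation, the two quantities named in the abstract can be written down concretely. The sketch below is a minimal NumPy illustration of a symmetric Chamfer Distance between two point sets and of voxel IoU between occupancy grids; the function names and the 0.5 occupancy threshold are illustrative assumptions, not the authors' implementation.

import numpy as np

def chamfer_distance(p, q):
    # Symmetric Chamfer Distance between point sets p (N, 3) and q (M, 3):
    # mean squared distance from each point to its nearest neighbour in the other set.
    d = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)  # (N, M) pairwise squared distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def voxel_iou(pred, gt, threshold=0.5):
    # Intersection over union between a predicted occupancy grid (probabilities)
    # and a binary ground-truth grid; the 0.5 threshold is an assumed choice.
    p = pred > threshold
    g = gt.astype(bool)
    union = np.logical_or(p, g).sum()
    return np.logical_and(p, g).sum() / union if union > 0 else 1.0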

References

  1. Walker J, Harris E, Lynagh C (2018) 3D printed smart molds for sand casting. Int J Metalcast 12(4):785–796

  2. Heinrich MP, Blendowski M, Oktay O (2018) TernaryNet: faster deep model inference without GPUs for medical 3D segmentation using sparse and binary convolutions. Int J CARS 13(9):1–11

  3. Chang AX, Funkhouser TG (2015) ShapeNet: an information-rich 3D model repository. arXiv:1512.03012

  4. Choy CB, Xu D, Gwak JY (2016) 3D-R2N2: a unified approach for single and multi-view 3D object reconstruction. Lecture Notes in Computer Science, vol 9912. Springer, Cham

  5. Fan H, Su H, Guibas L (2017) A point set generation network for 3D object reconstruction from a single image. In: Computer Vision and Pattern Recognition (CVPR), pp 2463–2471

  6. Srinivasan G, Roy K (2019) ReStoCNet: residual stochastic binary convolutional spiking neural network for memory-efficient neuromorphic computing. Front Neurosci 7(4):13

  7. Chui CK, Lin SB, Zhou DX (2018) Construction of neural networks for realization of localized deep learning. Front Appl Math Stat 4:12

  8. Huang G, Liu Z, van der Maaten L (2017) Densely connected convolutional networks. In: Computer Vision and Pattern Recognition (CVPR), pp 2261–2269

  9. Yang B, Rosa S, Markham A (2018) Dense 3D object reconstruction from a single depth view. IEEE Trans Pattern Anal Mach Intell:1–1

  10. Monszpart A, Mellado N, Brostow GJ (2015) RAPter: rebuilding man-made scenes with regular arrangements of planes. ACM Trans Graph 34(4):103

  11. Sipiran I, Gregor R, Schreck T (2014) Approximate symmetry detection in partial 3D meshes. Comput Graph Forum 33(7):131–140

  12. Tatarchenko M, Dosovitskiy A, Brox T (2017) Octree generating networks: efficient convolutional architectures for high-resolution 3D outputs. In: International Conference on Computer Vision (ICCV), pp 2107–2115

  13. Zhang Y, Liu Z, Li X, Yu Z (2019) Data-driven point cloud objects completion. Sensors 19(7):1514

  14. Lee T, Turin SY, Gosain AK et al (2018) Multi-view stereo in the operating room allows prediction of healing complications in a patient-specific model of reconstructive surgery. J Biomech 74:202–206

  15. Häming K, Peters G (2010) The structure-from-motion reconstruction pipeline – a survey with focus on short image sequences. Kybernetika 46(5):926–937

  16. Fuentes-Pacheco J, Ruiz-Ascencio J, Rendón-Mancha JM (2015) Visual simultaneous localization and mapping: a survey. Artif Intell Rev 43(1):55–81

  17. Tulsiani S, Zhou T, Efros A (2017) Multi-view supervision for single-view reconstruction via differentiable ray consistency. In: Computer Vision and Pattern Recognition (CVPR), pp 209–217

  18. Li Y, Dai A, Guibas L, Nießner M (2015) Database-assisted object retrieval for real-time 3D reconstruction. Comput Graph Forum 34(2):435–446

  19. Shi Y, Long P, Xu K, Huang H, Xiong Y (2016) Data-driven contextual modeling for 3D scene understanding. Comput Graph 55:55–67

  20. Luo J, Zhang J, Deng B et al (2018) 3D face reconstruction with geometry details from a single image. IEEE Trans Image Process 27(10):4756–4770

  21. Carreira J, Vicente S, Agapito L, Batista J (2016) Lifting object detection datasets into 3D. IEEE Trans Pattern Anal Mach Intell 38(7):1342–1355

  22. Huang Q, Wang H, Koltun V (2015) Single-view reconstruction via joint analysis of image and shape collections. ACM Trans Graph 34(4):87

  23. Su H, Huang Q, Mitra NJ, Li Y, Guibas L (2014) Estimating image depth using shape collections. ACM Trans Graph 33(4):37

  24. Wu Z, Song S, Khosla A, Yu F, Zhang L, Tang X, Xiao J (2015) 3D ShapeNets: a deep representation for volumetric shapes. In: Computer Vision and Pattern Recognition (CVPR), pp 1912–1920

  25. Varley J, DeChant C, Richardson A, Ruales J, Allen P (2017) Shape completion enabled robotic grasping. In: Intelligent Robots and Systems (IROS), pp 2442–2447

  26. Yang B, Rosa S, Markham A et al (2018) Dense 3D object reconstruction from a single depth view. IEEE Trans Pattern Anal Mach Intell:1–1

  27. Smith E, Meger D (2017) Improved adversarial systems for 3D object generation and reconstruction. Robot Learn 78(4):34–47

  28. Abadi M, Agarwal A, Barham P (2016) TensorFlow: large-scale machine learning on heterogeneous distributed systems. ACM SIGPLAN Notices 51:1–1

  29. Kingma D, Ba J (2014) Adam: a method for stochastic optimization. arXiv:1412.6980

  30. Everingham M, Van Gool L, Williams CKI, Winn J, Zisserman A (2010) The PASCAL visual object classes (VOC) challenge. Int J Comput Vis 88(1):303–338

  31. Jaderberg M, Dalibard V, Osindero S (2017) Population based training of neural networks. arXiv:1711.09846

  32. Lun Z, Gadelha M, Kalogerakis E (2017) 3D shape reconstruction from sketches via multi-view convolutional networks. In: International Conference on 3D Vision (3DV), pp 67–77

  33. Meagher D (1980) Octree encoding: a new technique for the representation, manipulation and display of arbitrary 3D objects by computer. Technical Report IPL-TR-80-111

  34. Gao H, Yang Y (2019) Multi-branch fusion network for hyperspectral image classification. Knowl-Based Syst 167:11–25

Acknowledgments

This research is partially supported by the China National Science Foundation (CNSF) under projects 61672136 and 61828202.

Author information

Corresponding author

Correspondence to Tingsong Ma.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ma, T., Kuang, P. & Tian, W. An improved recurrent neural networks for 3d object reconstruction. Appl Intell 50, 905–923 (2020). https://doi.org/10.1007/s10489-019-01523-3
