Abstract
3D-R2N2 and other advanced 3D reconstruction neural networks have achieved impressive results; however, most of them still suffer from training difficulties and loss of detail, owing to weak feature extraction and ill-suited loss functions. This paper aims to overcome these shortcomings by building a new model based on 3D-R2N2. The new model adopts a densely connected structure as its encoder and uses Chamfer Distance as its loss function. The aim is to strengthen the network’s ability to learn from complex data while focusing the whole network on reconstructing detailed structures. In addition, we design an improved decoder with two parallel predictor branches to make better use of the feature information and boost the network’s reconstruction performance. Extensive tests show that the proposed model, called 3D-R2N2-V2, is slightly slower than 3D-R2N2 at prediction, but it can train 20% to 30% faster and obtains 15% and 10% better voxel IoU on single- and multi-view reconstruction tasks, respectively. Compared with other recent state-of-the-art methods such as OGN and DRC, the reconstruction quality of our approach is also competitive.
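For readers unfamiliar with the two quantities named above, the following is a minimal NumPy sketch of a symmetric Chamfer Distance between two point sets and of voxel IoU between a predicted occupancy grid and a binary ground truth. It is an illustration only, not the paper's implementation: the function names, the use of squared distances averaged per direction, and the 0.5 occupancy threshold are assumptions, and the abstract does not say how the loss is applied to the network's voxel output.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer Distance between point sets a (N, 3) and b (M, 3).

    Assumption: squared Euclidean distances, averaged per direction.
    Other variants (sums, unsquared distances) are also in common use.
    """
    # Pairwise squared distances via broadcasting, shape (N, M).
    diff = a[:, None, :] - b[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # Nearest neighbour in b for each point of a, and vice versa.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())


def voxel_iou(pred: np.ndarray, target: np.ndarray, threshold: float = 0.5) -> float:
    """Intersection over union of two occupancy grids of equal shape.

    Assumption: pred holds occupancy probabilities binarised at `threshold`;
    target is a 0/1 grid.
    """
    p = pred > threshold
    t = target > 0.5
    union = np.logical_or(p, t).sum()
    if union == 0:
        # Both grids empty: treat as perfect agreement (a convention).
        return 1.0
    return float(np.logical_and(p, t).sum() / union)
```

Because the Chamfer term takes the nearest neighbour in both directions, it penalises both missing and spurious structure, which is consistent with the stated goal of recovering fine detail. Voxel IoU depends only on the thresholded occupancy, so it serves as an evaluation metric rather than a differentiable loss.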
Acknowledgments
This research is partially supported by the China National Science Foundation (CNSF) under project IDs 61672136 and 61828202.
About this article
Cite this article
Ma, T., Kuang, P. & Tian, W. An improved recurrent neural networks for 3d object reconstruction. Appl Intell 50, 905–923 (2020). https://doi.org/10.1007/s10489-019-01523-3
DOI: https://doi.org/10.1007/s10489-019-01523-3