3D Object Reconstruction with Deep Learning

  • Conference paper
Intelligent Information Processing XII (IIP 2024)

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 704)


Abstract

Recent breakthroughs in deep learning have accelerated progress in computer vision. Following major successes in 2D object perception and detection, considerable progress has also been made in 3D object reconstruction. Humans can infer the 3D structure of an object from a single 2D view, so computers must likewise be trained to reason in 3D to support key applications of computer vision. The use of deep learning for 3D object reconstruction from single-view images is evolving rapidly and producing significant results. In this research, we explore Mesh R-CNN, a well-known hybrid approach from Facebook that combines voxel generation and triangular mesh reconstruction to generate the 3D mesh structure of an object from a single-view 2D image. Although Mesh R-CNN reconstructs objects with varying geometry and topology, its mesh quality suffers from topological errors such as self-intersection, which produce rough, non-smooth meshes. We propose Mesh R-CNN with Laplacian Smoothing (Mesh R-CNN-LS), which uses a Laplacian smoothing and regularization algorithm to refine the rough, non-smooth mesh. The proposed Mesh R-CNN-LS constrains the triangular deformation and generates a smoother 3D mesh. Compared with the original Mesh R-CNN on the Pix3D dataset, Mesh R-CNN-LS showed better performance in terms of loss and average precision score.
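
The abstract does not state the exact Laplacian formulation used in Mesh R-CNN-LS, so the following is only a minimal sketch of the general idea, assuming uniform (umbrella) weights on a triangle mesh. The function names, the `step` and `iterations` parameters, and the small fan-shaped test mesh are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def vertex_neighbors(num_vertices, faces):
    """Collect, for each vertex, the set of vertices it shares an edge with."""
    neighbors = [set() for _ in range(num_vertices)]
    for a, b, c in faces:
        neighbors[a].update((b, c))
        neighbors[b].update((a, c))
        neighbors[c].update((a, b))
    return neighbors


def uniform_laplacian(vertices, faces):
    """Uniform Laplacian vectors: centroid of the neighbors minus the vertex."""
    vertices = np.asarray(vertices, dtype=float)
    delta = np.zeros_like(vertices)
    for i, nbrs in enumerate(vertex_neighbors(len(vertices), faces)):
        if nbrs:  # isolated vertices keep a zero Laplacian
            delta[i] = vertices[list(nbrs)].mean(axis=0) - vertices[i]
    return delta


def laplacian_loss(vertices, faces):
    """Regularization term: mean length of the Laplacian vectors."""
    delta = uniform_laplacian(vertices, faces)
    return float(np.linalg.norm(delta, axis=1).mean())


def laplacian_smooth(vertices, faces, step=0.5, iterations=10):
    """Explicit smoothing: move each vertex toward its neighbors' centroid."""
    v = np.asarray(vertices, dtype=float).copy()
    for _ in range(iterations):
        v = v + step * uniform_laplacian(v, faces)
    return v


# Hypothetical test mesh: four triangles around a raised center vertex (a spike).
vertices = np.array([
    [0.0, 0.0, 1.0],   # center, lifted above the ring
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [-1.0, 0.0, 0.0],
    [0.0, -1.0, 0.0],
])
faces = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1)]

smoothed = laplacian_smooth(vertices, faces)
print("loss before:", laplacian_loss(vertices, faces))
print("loss after: ", laplacian_loss(smoothed, faces))
```

Uniform Laplacian smoothing also shrinks the mesh, which is why it is usually applied with a small step or a limited number of iterations. A term such as `laplacian_loss` could alternatively be added to a training objective as a regularizer, whereas `laplacian_smooth` is the post-hoc variant; which of the two the authors use is not stated in the abstract.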

Acknowledgments

This work was supported in part by the Liaoning Province Applied Basic Research Program, "Human-machine Fusion Intelligent Modeling and Collaborative Optimization Driven by Data and Knowledge", under Grant 2023JH2/101300184. We thank Mr. John Files for providing HPC support for processing deep neural networks.

Author information

Corresponding author

Correspondence to Aboozar Taherkhani.

Copyright information

© 2024 IFIP International Federation for Information Processing

About this paper

Cite this paper

Aremu, S.S., Taherkhani, A., Liu, C., Yang, S. (2024). 3D Object Reconstruction with Deep Learning. In: Shi, Z., Torresen, J., Yang, S. (eds) Intelligent Information Processing XII. IIP 2024. IFIP Advances in Information and Communication Technology, vol 704. Springer, Cham. https://doi.org/10.1007/978-3-031-57919-6_12

  • DOI: https://doi.org/10.1007/978-3-031-57919-6_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-57918-9

  • Online ISBN: 978-3-031-57919-6

  • eBook Packages: Computer Science, Computer Science (R0)
