
Investigation on the Encoder-Decoder Application for Mesh Generation

  • Conference paper
  • In: Advances in Computer Graphics (CGI 2023)

Abstract

In computer graphics, 3D modeling is a fundamental concept: the process of creating three-dimensional objects or scenes with specialized software that allows users to create, manipulate, and modify geometric shapes to build complex models. The operation is time-consuming and requires specialized knowledge; typically, three to five hours of modeling are needed to obtain a basic mesh from a blueprint. Several approaches have tried to automate this operation to reduce modeling time. The most promising are based on deep learning, and among them Pixel2Mesh is one of the most interesting. However, training this network requires at least 150 epochs to obtain usable results. Starting from these premises, this work investigates whether a modified version of Pixel2Mesh can be trained in fewer epochs while obtaining comparable or better results. To achieve this, the convolutional block was modified, replacing the classification-based approach with an image reconstruction-based one: an encoder-decoder architecture built on state-of-the-art networks such as VGG, DenseNet, ResNet, and Inception. With this configuration, the convolutional block learns to reconstruct the image correctly from the source image, thereby learning the position of the object of interest within it. This made it possible to train the complete network in 50 epochs, achieving results that outperform the state of the art: tests on the networks show an increase of 0.5 percentage points over the state-of-the-art average.
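To make the described modification concrete, the following is a minimal sketch of such a reconstruction-based encoder-decoder, written with TensorFlow/Keras (consistent with the TensorFlow tools cited in the Notes below). The choice of VGG16 as the backbone, the decoder depth, and the pixel-wise MSE loss are illustrative assumptions, not the paper's exact configuration:

    import tensorflow as tf

    def build_encoder_decoder(input_shape=(224, 224, 3)):
        # Encoder: a pretrained VGG16 backbone (one of the networks named in
        # the abstract); backbone choice and ImageNet weights are assumptions.
        encoder = tf.keras.applications.VGG16(
            include_top=False, weights="imagenet", input_shape=input_shape)
        x = encoder.output  # 7x7x512 feature map for a 224x224 input
        # Decoder: five stride-2 transposed convolutions upsample the
        # features back to 224x224 so the block reconstructs the source image.
        for filters in (256, 128, 64, 32, 16):
            x = tf.keras.layers.Conv2DTranspose(
                filters, 3, strides=2, padding="same", activation="relu")(x)
        reconstruction = tf.keras.layers.Conv2D(
            3, 3, padding="same", activation="sigmoid")(x)
        return tf.keras.Model(encoder.input, reconstruction)

    # Pre-train on image reconstruction instead of classification; the
    # trained encoder would then serve as the convolutional feature
    # extractor feeding Pixel2Mesh's graph-convolutional deformation stages.
    model = build_encoder_decoder()
    model.compile(optimizer="adam", loss="mse")

After such reconstruction pre-training, the encoder's intermediate feature maps would replace the classification-trained perceptual features used in the original Pixel2Mesh pipeline.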


Notes

  1. https://www.tensorflow.org/graphics

  2. https://github.com/tensorflow/gnn

References

  1. Verykokou, S., Ioannidis, C.: An overview on image-based and scanner-based 3D modeling technologies. Sensors 23(2), 596 (2023)

  2. Bevilacqua, M.G., Russo, M., Giordano, A., Spallone, R.: 3D reconstruction, digital twinning, and virtual reality: architectural heritage applications. In: 2022 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW), pp. 92–96. IEEE (2022)

  3. Moradi, M., Noor, N.F.B.M., Abdullah, R.B.H.: The effects of problem-based serious games on learning 3D computer graphics. Iran. J. Sci. Technol. Trans. Electr. Eng. 46(4), 989–1004 (2022)

  4. Huang, H., Lee, C.-F.: Factors affecting usability of 3D model learning in a virtual reality environment. Interact. Learn. Environ. 30(5), 848–861 (2022)

  5. Okura, F.: 3D modeling and reconstruction of plants and trees: a cross-cutting review across computer graphics, vision, and plant phenotyping. Breed. Sci. 72(1), 31–47 (2022)

  6. Liu, R., et al.: TMM-Nets: transferred multi- to mono-modal generation for lupus retinopathy diagnosis. IEEE Trans. Med. Imaging 42(4), 1083–1094 (2023). https://doi.org/10.1109/TMI.2022.3223683. PMID: 36409801

  7. Xiao, B., Da, F.: Three-stage generative network for single-view point cloud completion. Vis. Comput. 38(12), 4373–4382 (2022). https://doi.org/10.1007/s00371-021-02301-4

  8. Wang, N., et al.: Pixel2mesh: generating 3D mesh models from single RGB images. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 52–67 (2018)

  9. Zhang, S., Tong, H., Xu, J., Maciejewski, R.: Graph convolutional networks: a comprehensive review. Comput. Soc. Netw. 6(1), 1–23 (2019)

  10. Wu, J., Zhang, C., Xue, T., Freeman, B., Tenenbaum, J.: Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling. In: Advances in Neural Information Processing Systems, vol. 29 (2016)

  11. Wu, J., et al.: Marrnet: 3D shape reconstruction via 2.5D sketches. In: Advances in Neural Information Processing Systems, vol. 30 (2017)

  12. Brock, A., Lim, T., Ritchie, J.M., Weston, N.: Generative and discriminative voxel modeling with convolutional neural networks, arXiv preprint arXiv:1608.04236 (2016)

  13. Guan, Y., Jahan, T., van Kaick, O.: Generalized autoencoder for volumetric shape generation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 268–269 (2020)

  14. Wu, R., Zhuang, Y., Xu, K., Zhang, H., Chen, B.: PQ-NET: a generative part Seq2Seq network for 3D shapes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 829–838 (2020)

  15. Xie, J., et al.: Learning descriptor networks for 3D shape synthesis and analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8629–8638 (2018)

  16. Nozawa, N., Shum, H.P.H., Feng, Q., Ho, E.S.L., Morishima, S.: 3D car shape reconstruction from a contour sketch using GAN and lazy learning. Vis. Comput. 38(4), 1317–1330 (2022). https://doi.org/10.1007/s00371-020-02024-y

  17. Wu, Z., et al.: SAGNet: structure-aware generative network for 3D-shape modeling. ACM Trans. Graph. (Proc. SIGGRAPH 2019) 38(4), 91:1–91:14 (2019)

  18. Achlioptas, P., Diamanti, O., Mitliagkas, I., Guibas, L.: Learning representations and generative models for 3D point clouds. In: International Conference on Machine Learning, pp. 40–49. PMLR (2018)

  19. Shu, D.W., Park, S.W., Kwon, J.: 3D point cloud generative adversarial network based on tree structured graph convolutions. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3859–3868 (2019)

  20. Li, C.-L., Zaheer, M., Zhang, Y., Poczos, B., Salakhutdinov, R.: Point cloud GAN, arXiv preprint arXiv:1810.05795 (2018)

  21. Zamorski, M., et al.: Adversarial autoencoders for compact representations of 3D point clouds. In: Computer Vision and Image Understanding, vol. 193, p. 102921 (2020)

  22. Gal, R., Bermano, A., Zhang, H., Cohen-Or, D.: MRGAN: multi-rooted 3D shape generation with unsupervised part disentanglement, arXiv preprint arXiv:2007.12944 (2020)

  23. Ramasinghe, S., Khan, S., Barnes, N., Gould, S.: Spectral-GANs for high-resolution 3D point-cloud generation. In: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 8169–8176. IEEE (2020)

  24. Li, R., Li, X., Hui, K.-H., Fu, C.-W.: SP-GAN: sphere-guided 3D shape generation and manipulation. ACM Trans. Graph. (TOG) 40(4), 1–12 (2021)

  25. Wang, N., et al.: Pixel2mesh: generating 3D mesh models from single RGB images. In: Proceedings of the European Conference on Computer Vision (ECCV) (2018)

  26. Wen, C., Zhang, Y., Li, Z., Fu, Y.: Pixel2mesh++: multi-view 3D mesh generation via deformation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1042–1051 (2019)

  27. Lv, C., Lin, W., Zhao, B.: Voxel structure-based mesh reconstruction from a 3D point cloud. IEEE Trans. Multimedia 24, 1815–1829 (2021)

  28. Deng, Z., et al.: Fast 3D face reconstruction from a single image combining attention mechanism and graph convolutional network. Vis. Comput. (2022). https://doi.org/10.1007/s00371-022-02679-9

  29. Brocchini, M., et al.: Monster: a deep learning-based system for the automatic generation of gaming assets. In: Mazzeo, P.L., Frontoni, E., Sclaroff, S., Distante, C. (eds.) ICIAP 2022. LNCS, vol. 13373, pp. 280–290. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-13321-3_25

  30. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition, arXiv preprint arXiv:1409.1556 (2014)

  31. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28

  32. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  33. Szegedy, C., et al.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)

  34. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)

  35. Ravi, N., et al.: Accelerating 3D deep learning with PyTorch3D, arXiv:2007.08501 (2020)

  36. Fey, M., Lenssen, J.E.: Fast graph representation learning with PyTorch geometric. In: ICLR Workshop on Representation Learning on Graphs and Manifolds (2019)

  37. Gao, J., et al.: GET3D: a generative model of high quality 3D textured shapes learned from images, arXiv preprint arXiv:2209.11163 (2022)

  38. Qian, G., et al.: Magic123: one image to high quality 3D object generation using both 2D and 3D diffusion priors, arXiv preprint arXiv:2306.17843 (2023)

  39. Kim, K.-S., Zhang, D., Kang, M.-C., Ko, S.-J.: Improved simple linear iterative clustering superpixels. In: 2013 IEEE International Symposium on Consumer Electronics (ISCE), pp. 259–260 (2013). https://doi.org/10.1109/ISCE.2013.6570216

Author information

Correspondence to Marco Mameli.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Mameli, M., Balloni, E., Mancini, A., Frontoni, E., Zingaretti, P. (2024). Investigation on the Encoder-Decoder Application for Mesh Generation. In: Sheng, B., Bi, L., Kim, J., Magnenat-Thalmann, N., Thalmann, D. (eds) Advances in Computer Graphics. CGI 2023. Lecture Notes in Computer Science, vol 14496. Springer, Cham. https://doi.org/10.1007/978-3-031-50072-5_31

  • DOI: https://doi.org/10.1007/978-3-031-50072-5_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-50071-8

  • Online ISBN: 978-3-031-50072-5

  • eBook Packages: Computer Science, Computer Science (R0)
