Abstract
In computer graphics, 3D modeling is the process of creating three-dimensional objects or scenes with specialized software that lets users create, manipulate, and modify geometric shapes to build complex models. The task is time-consuming and requires specialized knowledge: obtaining even a basic mesh from a blueprint typically takes three to five hours of modeling. Several approaches have attempted to automate this task to reduce modeling time; among the most promising are those based on deep learning, such as Pixel2Mesh. However, training this network requires at least 150 epochs to obtain usable results. Starting from these premises, this work investigates whether a modified version of Pixel2Mesh can be trained in fewer epochs while obtaining comparable or better results. To this end, the convolutional block was modified, replacing the classification-based approach with an image-reconstruction-based one: an encoder-decoder architecture built from state-of-the-art networks such as VGG, DenseNet, ResNet, and Inception. With this configuration, the convolutional block learns to reconstruct the input image from the source image, and in doing so learns the position of the object of interest within it. Using this approach, the complete network can be trained in 50 epochs, achieving results that outperform the state of the art: the tests performed show an average improvement of 0.5 percentage points over state-of-the-art results.
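The reconstruction-based idea described above can be sketched as follows. This is a minimal, hypothetical PyTorch illustration, not the paper's actual implementation: a small VGG-style encoder compresses the image and a mirrored decoder reconstructs it, so the bottleneck features must encode where the object sits in the image. All layer sizes and names here are illustrative assumptions; the paper builds its encoder from full backbones such as VGG, DenseNet, ResNet, or Inception.

```python
import torch
import torch.nn as nn

class ConvEncoderDecoder(nn.Module):
    """Illustrative encoder-decoder convolutional block trained for
    image reconstruction rather than classification (layer sizes are
    assumptions, not the paper's exact configuration)."""

    def __init__(self):
        super().__init__()
        # VGG-style encoder: conv + ReLU, halving resolution twice.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                   # 224 -> 112
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                   # 112 -> 56
        )
        # Mirrored decoder: transposed convolutions restore the input size.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),   # 56 -> 112
            nn.ConvTranspose2d(16, 3, 2, stride=2), nn.Sigmoid(), # 112 -> 224
        )

    def forward(self, x):
        z = self.encoder(x)        # bottleneck features, reusable by the mesh branch
        return self.decoder(z), z

model = ConvEncoderDecoder()
img = torch.rand(1, 3, 224, 224)           # dummy RGB input
recon, feats = model(img)
loss = nn.functional.mse_loss(recon, img)  # reconstruction objective
```

Training the block against a reconstruction loss like the one above, instead of a classification loss, is what lets the intermediate features capture object position before the mesh-deformation stages are trained.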
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Mameli, M., Balloni, E., Mancini, A., Frontoni, E., Zingaretti, P. (2024). Investigation on the Encoder-Decoder Application for Mesh Generation. In: Sheng, B., Bi, L., Kim, J., Magnenat-Thalmann, N., Thalmann, D. (eds) Advances in Computer Graphics. CGI 2023. Lecture Notes in Computer Science, vol 14496. Springer, Cham. https://doi.org/10.1007/978-3-031-50072-5_31
Print ISBN: 978-3-031-50071-8
Online ISBN: 978-3-031-50072-5