Abstract
In computer graphics, 3D modeling is the process of creating three-dimensional objects or scenes with specialized software that lets users create, manipulate, and modify geometric shapes to build complex models. The task is time-consuming and requires specialized knowledge: obtaining even a basic mesh from a blueprint typically takes three to five hours of modeling. Several approaches have attempted to automate this task to reduce modeling time; among the most promising are those based on deep learning, such as Pixel2Mesh. However, training this network requires at least 150 epochs to obtain usable results. Starting from these premises, this work investigates whether a modified version of Pixel2Mesh can be trained in fewer epochs while obtaining comparable or better results. To this end, the convolutional block was modified, replacing the classification-based approach with an image-reconstruction-based one: an encoder-decoder architecture built from state-of-the-art networks such as VGG, DenseNet, ResNet, and Inception. With this configuration, the convolutional block learns to reconstruct the input image from the source image, and in doing so learns the position of the object of interest within it. Using this approach, the complete network can be trained in 50 epochs, achieving results that outperform the state of the art: the tests performed show an average improvement of 0.5 percentage points over state-of-the-art results.
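The reconstruction-based idea described above can be sketched as follows. This is a minimal, hypothetical PyTorch illustration, not the paper's actual implementation: a small VGG-style encoder compresses the image and a mirrored decoder reconstructs it, so the bottleneck features must encode where the object sits in the image. All layer sizes and names here are illustrative assumptions; the paper builds its encoder from full backbones such as VGG, DenseNet, ResNet, or Inception.

```python
import torch
import torch.nn as nn

class ConvEncoderDecoder(nn.Module):
    """Illustrative encoder-decoder convolutional block trained for
    image reconstruction rather than classification (layer sizes are
    assumptions, not the paper's exact configuration)."""

    def __init__(self):
        super().__init__()
        # VGG-style encoder: conv + ReLU, halving resolution twice.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                   # 224 -> 112
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                   # 112 -> 56
        )
        # Mirrored decoder: transposed convolutions restore the input size.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),   # 56 -> 112
            nn.ConvTranspose2d(16, 3, 2, stride=2), nn.Sigmoid(), # 112 -> 224
        )

    def forward(self, x):
        z = self.encoder(x)        # bottleneck features, reusable by the mesh branch
        return self.decoder(z), z

model = ConvEncoderDecoder()
img = torch.rand(1, 3, 224, 224)           # dummy RGB input
recon, feats = model(img)
loss = nn.functional.mse_loss(recon, img)  # reconstruction objective
```

Training the block against a reconstruction loss like the one above, instead of a classification loss, is what lets the intermediate features capture object position before the mesh-deformation stages are trained.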
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Mameli, M., Balloni, E., Mancini, A., Frontoni, E., Zingaretti, P. (2024). Investigation on the Encoder-Decoder Application for Mesh Generation. In: Sheng, B., Bi, L., Kim, J., Magnenat-Thalmann, N., Thalmann, D. (eds) Advances in Computer Graphics. CGI 2023. Lecture Notes in Computer Science, vol 14496. Springer, Cham. https://doi.org/10.1007/978-3-031-50072-5_31
Print ISBN: 978-3-031-50071-8
Online ISBN: 978-3-031-50072-5