Abstract
3D shape processing is a fundamental task in computer applications. In particular, 3D meshes provide a natural and detailed representation of objects. However, their non-uniform and irregular data structure makes it difficult to apply deep learning techniques to 3D meshes. Moreover, previous deep learning approaches for 3D meshes focus mainly on local structural features, which leads to a loss of information. In this paper, to achieve better mesh shape awareness, we propose a novel deep learning approach that makes full use of mesh data and exploits comprehensive features for more accurate classification. To leverage the self-attention mechanism and learn global features of mesh edges, we propose a novel attention-based structure built around an edge attention module. For local feature learning, our model aggregates edge features from adjacent edges. We further refine the network by discarding pooling layers for efficiency, so that it captures comprehensive features from both local and global fields for better shape awareness. Moreover, we adopt a spatial position encoding module based on the spatial information of edges, which helps the model better distinguish edges and fully exploit the mesh data. Extensive experiments on popular datasets demonstrate the effectiveness of our model on classification tasks, where it outperforms prior methods.
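To make the core idea concrete, the sketch below illustrates global self-attention over mesh edges combined with a spatial position encoding derived from edge midpoints. This is a minimal, hypothetical illustration only: the paper's actual edge attention module, feature set, and position encoding are not specified in the abstract, and all weights here are random stand-ins for trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def edge_self_attention(edge_feats, edge_midpoints, d_model=16):
    """Single-head scaled dot-product self-attention over mesh edges.

    Hypothetical sketch: adds a spatial position encoding (a linear map
    of edge midpoints) to the edge features before attending globally
    across all edges. Weights are random placeholders, not trained.
    """
    n, d_in = edge_feats.shape
    # Spatial position encoding: linear projection of edge midpoints.
    w_pos = rng.normal(size=(edge_midpoints.shape[1], d_model)) * 0.1
    w_in = rng.normal(size=(d_in, d_model)) * 0.1
    x = edge_feats @ w_in + edge_midpoints @ w_pos   # (n, d_model)

    # Standard self-attention: every edge attends to every other edge,
    # giving each edge a globally informed feature vector.
    w_q = rng.normal(size=(d_model, d_model)) * 0.1
    w_k = rng.normal(size=(d_model, d_model)) * 0.1
    w_v = rng.normal(size=(d_model, d_model)) * 0.1
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(d_model)              # (n, n) edge-to-edge
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)         # softmax rows
    return attn @ v                                  # (n, d_model)

# 100 edges with 5 input features each (e.g. dihedral angle, edge-length
# ratios) and a 3D midpoint per edge -- illustrative values only.
feats = rng.normal(size=(100, 5))
mids = rng.normal(size=(100, 3))
out = edge_self_attention(feats, mids)
print(out.shape)  # (100, 16)
```

In contrast to purely local aggregation over adjacent edges, the attention matrix here is dense over all edge pairs, which is what gives the module its global receptive field.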








Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Funding
This work is supported by the National Natural Science Foundation of China (Grant No. 62072348). The numerical calculations in this paper were performed on the supercomputing system at the Supercomputing Center of Wuhan University.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Cite this article
Dai, J., Fan, R., Song, Y. et al. MEAN: An attention-based approach for 3D mesh shape classification. Vis Comput 40, 2987–3000 (2024). https://doi.org/10.1007/s00371-023-03003-9