WalkFormer: 3D mesh analysis via transformer on random walk

  • Original Article
  • Neural Computing and Applications

Abstract

A 3D mesh is a popular representation of 3D shapes. A typical approach to mesh analysis maps the 3D mesh into 1D sequence data via random walk sampling. However, existing random walk-based approaches cannot make full use of attentive regions, which limits their capability for 3D shape analysis. In addition, existing methods process the random walk sequentially in its discovery order, which incurs computational overhead. In this paper, we propose a novel neural framework named WalkFormer, which applies a transformer to random walks to fully exploit the semantic information in a 3D mesh. First, we propose a transformer-based framework that learns semantic information from a random walk over a 3D mesh. Second, to capture the attentive regions of the random walk, our approach extends the multi-head self-attention mechanism to specific 3D mesh analysis tasks, and to establish long-range interactions between vertices in the walk, it adopts a novel relative position encoding module. Thus, both local and global information in the random walk can be captured and learned. Third, we observe that for 3D mesh analysis the sequential processing of the random walk is redundant: unlike previous random walk methods, our approach can be executed in a parallelized manner, which greatly improves computational efficiency. Extensive experiments demonstrate the effectiveness of the proposed method on typical 3D shape analysis tasks.
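
As a rough illustration of the first stage the abstract describes, the Python sketch below builds vertex adjacency from a mesh's triangle faces and samples a fixed-length random walk. This is a minimal sketch rather than the authors' implementation: the function names, the preference for unvisited neighbors, and the backtracking fallback are assumptions for illustration.

```python
# Illustrative random-walk sampling on a triangle mesh (not the authors'
# code): build vertex adjacency from faces, then walk the surface,
# preferring unvisited neighbors so the walk keeps exploring new regions.
import random
from collections import defaultdict

def build_adjacency(faces):
    """Map each vertex index to the set of its neighbors."""
    adj = defaultdict(set)
    for a, b, c in faces:          # each face is a triangle (i, j, k)
        adj[a].update((b, c))
        adj[b].update((a, c))
        adj[c].update((a, b))
    return adj

def random_walk(adj, length, start=None):
    """Sample a walk of `length` vertices; fall back to already-visited
    neighbors when the walk gets stuck (a simple backtracking rule)."""
    current = start if start is not None else random.choice(list(adj))
    walk, visited = [current], {current}
    while len(walk) < length:
        neighbors = list(adj[current])
        unvisited = [v for v in neighbors if v not in visited]
        current = random.choice(unvisited or neighbors)
        walk.append(current)
        visited.add(current)
    return walk   # 1D vertex sequence, later embedded and fed to the transformer

# Example: two triangles sharing an edge
faces = [(0, 1, 2), (1, 2, 3)]
print(random_walk(build_adjacency(faces), length=6))
```

The abstract's other two ingredients, relative position encoding inside multi-head self-attention and parallel processing of the walk, can be sketched in the same spirit. The layer below adds a learned bias per head and per clipped relative offset, in the spirit of Shaw et al.'s relative position representations; the class name, tensor shapes, and clipping distance are illustrative assumptions, and the paper's actual module may differ.

```python
# Sketch of multi-head self-attention with a learned relative position
# bias over a walk of vertices. All names and the clipping distance
# `max_dist` are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelativeSelfAttention(nn.Module):
    def __init__(self, dim, num_heads=8, max_dist=32):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads, self.head_dim = num_heads, dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        # one learned scalar bias per head and per clipped relative offset
        self.rel_bias = nn.Parameter(torch.zeros(num_heads, 2 * max_dist + 1))
        self.max_dist = max_dist

    def forward(self, x):                     # x: (batch, walk_len, dim)
        B, N, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        shape = (B, N, self.num_heads, self.head_dim)
        q, k, v = (t.view(shape).transpose(1, 2) for t in (q, k, v))
        attn = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        # relative offsets i - j between walk positions, clipped to range
        idx = torch.arange(N, device=x.device)
        rel = (idx[:, None] - idx[None, :]).clamp(-self.max_dist, self.max_dist)
        attn = attn + self.rel_bias[:, rel + self.max_dist]  # broadcast over batch
        out = F.softmax(attn, dim=-1) @ v                    # all positions at once
        return self.proj(out.transpose(1, 2).reshape(B, N, D))
```

Because the bias depends only on the offset between walk positions, attention over the entire walk is computed in one batched matrix product rather than step by step, which is the parallel execution the abstract contrasts with order-by-order recurrent processing.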

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Funding

This work was supported by the National Natural Science Foundation of China under Grant 62072348 and by the Yunnan Province Major Science and Technology Special Plan Project No. 202202AF080004. The numerical calculations in this paper were performed on the supercomputing system at the Supercomputing Center of Wuhan University.

Author information

Corresponding author

Correspondence to Fazhi He.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Guo, Q., He, F., Fan, B. et al. WalkFormer: 3D mesh analysis via transformer on random walk. Neural Comput & Applic 36, 3499–3511 (2024). https://doi.org/10.1007/s00521-023-09279-1
