Global Patch Cross-Attention for Point Cloud Analysis

Tao, Manli; Zhao, Chaoyang; Wang, Jinqiao; Tang, Ming

doi:10.1007/978-3-031-18913-5_8

Manli Tao (ORCID: orcid.org/0000-0002-7484-8173), Chaoyang Zhao, Jinqiao Wang & Ming Tang

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13536)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)


Abstract

Despite the great progress of deep learning methods in 3D point cloud analysis, insufficient contextual semantic description and the misidentification of confusing objects remain difficult problems. To address these challenges, we propose a novel approach, the Global Patch Cross-Attention Network (GPCAN), to learn more discriminative point cloud features effectively. Specifically, a global patch construction module generates global patches that share holistic shape similarity but differ in local structure. Local features are then extracted from both the original point cloud and these global patches. Further, a transformer-style cross-attention module models cross-object relations, namely all point-pair attentions between the original point cloud and each global patch, to learn the context-dependent features of each global patch. In this way, our method integrates the features of the original point cloud with both the local features and the global contexts of each global patch, enhancing the discriminative power of the model. Extensive experiments on challenging point cloud classification and part segmentation benchmarks verify that GPCAN achieves state-of-the-art performance on both synthetic and real-world datasets.
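
The cross-attention step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the global patch is drawn by uniform random subsampling (the hypothetical `sample_global_patch`), raw xyz coordinates stand in for learned local features, queries come from the patch while keys and values come from the original cloud, the learned query/key/value projections are omitted, and a residual sum fuses the attended context with the patch features.

```python
import math
import random


def sample_global_patch(points, k, rng):
    """Hypothetical stand-in for global patch construction (an assumption):
    draw k points uniformly from the whole shape, so the subset keeps the
    overall outline while its local structure differs from draw to draw."""
    return rng.sample(points, k)


def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(patch_feats, cloud_feats):
    """Scaled dot-product cross-attention over all point pairs.

    Queries are the patch features; keys and values are the features of the
    original cloud (assumed direction). Projections are left as identity.
    """
    d = len(cloud_feats[0])
    scale = math.sqrt(d)
    fused, weights = [], []
    for q in patch_feats:
        # One score per (patch point, cloud point) pair.
        scores = [sum(a * b for a, b in zip(q, key)) / scale for key in cloud_feats]
        w = softmax(scores)
        # Attention-weighted sum of the cloud features (the context vector).
        context = [
            sum(w[j] * cloud_feats[j][c] for j in range(len(cloud_feats)))
            for c in range(d)
        ]
        # Residual fusion of patch feature and context (an assumption).
        fused.append([qi + ci for qi, ci in zip(q, context)])
        weights.append(w)
    return fused, weights


rng = random.Random(0)
# Toy cloud of 64 points; xyz coordinates stand in for local features.
cloud = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(64)]
patch = sample_global_patch(cloud, 16, rng)
fused, attn = cross_attention(patch, cloud)
```

Each row of `attn` holds the attention of one patch point over every point of the original cloud, and `fused` holds the resulting context-dependent patch features. A practical model would use learned projections and multiple heads.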

This work was supported by the Key-Area Research and Development Program of Guangdong Province (No. 2021B0101410003) and the National Natural Science Foundation of China under Grants 61976210, 62176254, 62006230, 62002357 and 61876086.



Author information

Authors and Affiliations

University of Chinese Academy of Sciences, No.19(A) Yuquan Road, Shijingshan District, Beijing, 100049, China
Manli Tao, Jinqiao Wang & Ming Tang
Institute of Automation, Chinese Academy of Sciences, 95 Zhongguancun East Road, Beijing, 100190, China
Manli Tao, Chaoyang Zhao, Jinqiao Wang & Ming Tang


Corresponding author

Correspondence to Manli Tao.

Editor information

Editors and Affiliations

Southern University of Science and Technology, Shenzhen, China
Shiqi Yu
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Zhaoxiang Zhang
Hong Kong Baptist University, Hong Kong, China
Pong C. Yuen
Northwestern Polytechnical University, Xi'an, China
Junwei Han
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Tieniu Tan
Hong Kong Baptist University, Hong Kong, China
Yike Guo
Sun Yat-sen University, Guangzhou, China
Jianhuang Lai
Southern University of Science and Technology, Shenzhen, China
Jianguo Zhang


Cite this paper

Tao, M., Zhao, C., Wang, J., Tang, M. (2022). Global Patch Cross-Attention for Point Cloud Analysis. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13536. Springer, Cham. https://doi.org/10.1007/978-3-031-18913-5_8


DOI: https://doi.org/10.1007/978-3-031-18913-5_8
Published: 27 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18912-8
Online ISBN: 978-3-031-18913-5
eBook Packages: Computer Science, Computer Science (R0)
