Abstract
3D point cloud recognition is fundamental to visual perception systems such as autonomous driving, robotics, and virtual reality. Because point clouds are sparse and irregular, previous 3D point networks perform convolution over nearby points and ignore long-range dependencies on the global structure. To address this problem, we propose the Viewport Group Point Cloud Network for 3D Shape Recognition (VGPCNet), in which features are grouped according to viewports rather than local neighboring points in order to model long-range global context. First, we use viewports as proxies to capture both local and global features from outside the object. Points are grouped effectively and efficiently by their visibility attribute, which captures local geometric detail while also obtaining the global structure seen from the external viewport. Second, a graph-based feature consolidation module enhances the viewport features by modeling interactions between different viewports. Finally, a novel attention-based feature aggregation module aggregates the multiple viewport features into a global representation. We evaluate VGPCNet on three widely used benchmarks, ModelNet40/10, ScanObjectNN, and ShapeNetCore55, for shape classification and retrieval. Extensive experiments demonstrate the effectiveness of our method and its superior performance (94.1% on ModelNet40) over state-of-the-art methods.
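To make the grouping and aggregation steps concrete, the following is a minimal sketch and not the authors' implementation. It assumes a crude half-space test in place of a real visibility computation, a channel-wise max as the per-viewport descriptor, and a hypothetical scoring vector `w` for the attention weights. The graph-based consolidation step is omitted. All function names are illustrative.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def group_by_viewport(points, view_dirs):
    """Collect, for each viewport, the points that face it.

    Assumption: a point counts as visible from a viewport when it lies in
    the half-space (through the centroid) that faces the viewing direction.
    This is a stand-in for a true visibility test.
    """
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    groups = []
    for d in view_dirs:
        group = [p for p in points
                 if sum((p[i] - centroid[i]) * d[i] for i in range(3)) > 0.0]
        groups.append(group)
    return groups

def viewport_feature(group):
    """Toy per-viewport descriptor: channel-wise max over the grouped points."""
    if not group:
        return [0.0, 0.0, 0.0]
    return [max(p[i] for p in group) for i in range(3)]

def attention_aggregate(features, w):
    """Softmax-weighted sum of viewport features.

    w is a hypothetical scoring vector; in a trained network the scores
    would come from learned parameters.
    """
    scores = [sum(f[i] * w[i] for i in range(len(w))) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(weights[k] * features[k][i] for k in range(len(features)))
              for i in range(dim)]
    return pooled, weights

# Six points on the unit axes, viewed from three directions.
points = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
          (0.0, -1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
view_dirs = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]

groups = group_by_viewport(points, view_dirs)
features = [viewport_feature(g) for g in groups]
pooled, weights = attention_aggregate(features, w=[0.5, 0.5, 0.5])
```

Each viewport contributes one feature vector, and the softmax weights decide how much each viewport counts toward the single global descriptor `pooled`, so the result has a fixed size regardless of how many points each viewport sees.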
Funding
This study was supported by the Special Project on Basic Research of Frontier Leading Technology of Jiangsu Province of China [Grant No. BK20192004C] and the Natural Science Foundation of Jiangsu Province of China [Grant No. BK20181269].
Ethics declarations
Competing interests
The authors have no relevant financial or non-financial interests to disclose.
About this article
Cite this article
Zhang, Z., Yu, Y. & Da, F. VGPCNet: viewport group point clouds network for 3D shape recognition. Appl Intell 53, 19060–19073 (2023). https://doi.org/10.1007/s10489-023-04498-4
DOI: https://doi.org/10.1007/s10489-023-04498-4