Abstract
Point cloud classification and segmentation are challenging because point clouds are irregularly structured, and the difficulty grows when the clouds exhibit translation variance. To address this, this paper proposes a self-augment convolutional neural network (SACNN), which both extracts more discriminative features from the point cloud and alleviates the translation variance problem. Specifically, we first represent the point cloud as a dynamic graph, which keeps the number of points unchanged during feature learning and thus avoids information loss. The dynamic graph allows both global and local features of the point cloud to be learned. Then, to reduce the translation variance in the dynamic graphs, a self-augment convolution (SAConv) module is designed so that points align their coordinates based on learned features. Finally, a local mixed aggregation module is proposed to combine the overview descriptor and the detailed descriptor of each point's neighbors. Experiments on several standard benchmarks show that SACNN outperforms state-of-the-art methods in both 3D point cloud classification and segmentation.
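As an illustration of the dynamic-graph representation mentioned in the abstract, the following is a minimal sketch assuming a k-nearest-neighbor graph in the style of dynamic graph CNNs; the abstract does not state how SACNN builds its graph. The function name `knn_graph` and the toy points are hypothetical. The sketch shows two properties the abstract relies on: every point keeps its own node (no downsampling), and neighbor selection based on pairwise distances is unaffected by translating the whole cloud.

```python
from typing import List, Sequence


def knn_graph(features: Sequence[Sequence[float]], k: int) -> List[List[int]]:
    """Return, for every point, the indices of its k nearest neighbors.

    Distances are squared Euclidean distances in whatever space `features`
    lives in: raw xyz coordinates at the first layer, or learned features
    at later layers (recomputing the graph per layer is what makes it
    "dynamic"). This is an illustrative assumption, not the paper's code.
    """
    n = len(features)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")
    graph = []
    for i in range(n):
        dists = []
        for j in range(n):
            if j == i:
                continue  # a point is not its own neighbor
            d = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            dists.append((d, j))
        dists.sort()  # ties are broken by the smaller index
        graph.append([j for _, j in dists[:k]])
    return graph


# Toy cloud: three points near the origin, two points far away.
points = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 5), (5, 6, 5)]
graph = knn_graph(points, k=2)

# Translating every point by the same offset leaves the graph unchanged.
offset = (10, -3, 7)
shifted = [tuple(c + o for c, o in zip(p, offset)) for p in points]
shifted_graph = knn_graph(shifted, k=2)
```

The graph has one neighbor list per input point, so the number of points is preserved through feature learning. Absolute coordinates fed to a network still change under translation even though the neighbor sets do not; that remaining variance is what the abstract says the SAConv module targets.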
Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.
Acknowledgements
This work was supported by the Natural Science Foundation of Fujian Province of China (Grants No. 2021J01540 and No. 2021J05106) and the National Natural Science Foundation of China (Grants No. 62032022 and No. 62006215).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Meng, X., Lu, X., Ye, H. et al. A new self-augment CNN for 3D point cloud classification and segmentation. Int. J. Mach. Learn. & Cyber. 15, 807–818 (2024). https://doi.org/10.1007/s13042-023-01940-4
DOI: https://doi.org/10.1007/s13042-023-01940-4