A new self-augment CNN for 3D point cloud classification and segmentation

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Point cloud classification and segmentation are challenging tasks because point clouds are irregularly structured, especially when the point clouds exhibit translation variance. To overcome this barrier, this paper proposes a self-augment convolutional neural network (SACNN), which not only extracts more discriminative features from the point cloud but also alleviates the translation variance problem. Specifically, we first represent the point cloud as a dynamic graph, so that the number of points is preserved during feature learning and information loss is avoided. Benefiting from the dynamic graph, both the global and local features of point clouds can be learned. Then, to reduce the translation variance in the dynamic graphs, a self-augment convolution (SAConv) module is designed so that points align their coordinates based on learned features. Finally, a local mixed aggregation module is proposed to combine the overview and the detailed descriptors of the neighbors. Experiments on several standard benchmarks verify the superiority of SACNN over state-of-the-art methods in both 3D point cloud classification and segmentation tasks.
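
To make the graph-based description above concrete, the following is a minimal NumPy sketch of one graph-building and aggregation step. It is not the SACNN implementation, and it does not reproduce the SAConv module. It assumes a k-nearest-neighbor graph in the style of dynamic graph CNNs (which would be rebuilt from the learned features at every layer), relative offsets as edge features, and max pooling plus mean pooling as stand-ins for the "detailed" and "overview" descriptors. The function names `knn_indices`, `edge_features`, and `mixed_aggregate` are hypothetical, as are the toy cloud size and `k=4`.

```python
import numpy as np


def knn_indices(feats: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k nearest neighbors of every point (self excluded), shape (N, k)."""
    # Pairwise squared Euclidean distances, shape (N, N).
    diff = feats[:, None, :] - feats[None, :, :]
    dist2 = np.sum(diff * diff, axis=-1)
    np.fill_diagonal(dist2, np.inf)  # a point is not its own neighbor
    return np.argsort(dist2, axis=1)[:, :k]


def edge_features(feats: np.ndarray, idx: np.ndarray) -> np.ndarray:
    """Relative offsets x_j - x_i for each neighbor j of point i, shape (N, k, C)."""
    return feats[idx] - feats[:, None, :]


def mixed_aggregate(edges: np.ndarray) -> np.ndarray:
    """Concatenate max- and mean-pooled neighbor descriptors, shape (N, 2C).

    Assumption: max pooling plays the role of the detailed descriptor and
    mean pooling the role of the overview descriptor.
    """
    return np.concatenate([edges.max(axis=1), edges.mean(axis=1)], axis=-1)


rng = np.random.default_rng(0)
points = rng.normal(size=(32, 3))               # toy cloud: 32 points in 3D
shifted = points + np.array([5.0, -2.0, 1.0])   # the same cloud, translated

# Build the graph and aggregate once for each copy of the cloud.
local = mixed_aggregate(edge_features(points, knn_indices(points, k=4)))
local_shifted = mixed_aggregate(edge_features(shifted, knn_indices(shifted, k=4)))

# Every input point keeps its own output row; relative offsets cancel the shift.
print(local.shape, np.allclose(local, local_shifted))
```

In this sketch the output has one row per input point, so no points are discarded by the graph step, and descriptors built from relative offsets are unchanged when the whole cloud is translated. Absolute coordinates fed directly into a network do not have that property, which is the kind of translation variance the abstract refers to.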

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Acknowledgements

This work was supported by the Natural Science Foundation of Fujian Province of China (Grants No. 2021J01540 and No. 2021J05106) and the National Natural Science Foundation of China (Grants No. 62032022 and No. 62006215).

Author information

Corresponding author

Correspondence to Feilong Cao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Meng, X., Lu, X., Ye, H. et al. A new self-augment CNN for 3D point cloud classification and segmentation. Int. J. Mach. Learn. & Cyber. 15, 807–818 (2024). https://doi.org/10.1007/s13042-023-01940-4

  • DOI: https://doi.org/10.1007/s13042-023-01940-4
