A new self-augment CNN for 3D point cloud classification and segmentation

Meng, Xinhong; Lu, Xinyu; Ye, Hailiang; Yang, Bing; Cao, Feilong

doi:10.1007/s13042-023-01940-4

Original Article
Published: 17 August 2023

Volume 15, pages 807–818 (2024)

International Journal of Machine Learning and Cybernetics

Xinhong Meng¹, Xinyu Lu², Hailiang Ye², Bing Yang² & Feilong Cao²

Abstract

Point cloud classification and segmentation are challenging tasks because point clouds are irregularly structured, especially when the clouds exhibit translation variance. To overcome this barrier, this paper proposes a self-augment convolutional neural network (SACNN), which not only extracts more discriminative features from the point cloud but also alleviates the translation variance problem. Specifically, we first represent the point cloud as a dynamic graph, which keeps the number of points unchanged during feature learning and so avoids information loss. With the dynamic graph, both the global and the local features of the point cloud can be learned. Then, to reduce the translation variance in the dynamic graphs, a self-augment convolution (SAConv) module is designed so that points align their coordinates based on learned features. Finally, a local mixed aggregation module is proposed to combine an overview of the neighbors with a detailed descriptor of them. Experiments on several standard benchmarks verify the superiority of SACNN over state-of-the-art methods in both 3D point cloud classification and segmentation tasks.
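The abstract does not give construction details, so the following is only a minimal sketch of the general ideas it names, not the authors' SACNN or SAConv module. It assumes the dynamic graph is a k-nearest-neighbor graph, that edge features are the relative offsets between a point and its neighbors, and that "mixed" aggregation concatenates a channel-wise mean (an overview) with a channel-wise max (a detail descriptor). The function names `knn_graph`, `relative_offsets` and `mixed_aggregate` are hypothetical. The sketch also checks that descriptors built from relative offsets do not change when the whole cloud is translated.

```python
import math

def knn_graph(features, k):
    """Return, for each point, the indices of its k nearest neighbors
    (excluding itself) measured in the given feature space."""
    neighbors = []
    for i, fi in enumerate(features):
        dists = [(math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i]
        dists.sort()  # ties are broken by neighbor index
        neighbors.append([j for _, j in dists[:k]])
    return neighbors

def relative_offsets(points, neighbors):
    """Edge features x_j - x_i for every neighbor j of every point i."""
    return [
        [[pj - pi for pi, pj in zip(points[i], points[j])] for j in nbrs]
        for i, nbrs in enumerate(neighbors)
    ]

def mixed_aggregate(edge_feats):
    """Assumed mixing rule: concatenate the channel-wise mean and the
    channel-wise max over each point's neighbors."""
    out = []
    for feats in edge_feats:
        channels = list(zip(*feats))
        mean = [sum(c) / len(c) for c in channels]
        peak = [max(c) for c in channels]
        out.append(mean + peak)
    return out

# A toy cloud of five points and a translated copy of it.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0),
          (0.0, 0.0, 3.0), (4.0, 4.0, 4.0)]
shifted = [(x + 5.0, y - 2.0, z + 3.0) for x, y, z in points]

graph = knn_graph(points, k=2)
graph_shifted = knn_graph(shifted, k=2)

desc = mixed_aggregate(relative_offsets(points, graph))
desc_shifted = mixed_aggregate(relative_offsets(shifted, graph_shifted))

# Compare the descriptors of the original and the translated cloud.
same = all(
    math.isclose(a, b, abs_tol=1e-9)
    for row_a, row_b in zip(desc, desc_shifted)
    for a, b in zip(row_a, row_b)
)
print(graph == graph_shifted, same)
```

Passing learned per-point features instead of raw coordinates to `knn_graph` at each layer would rebuild the graph as the features change, which is one common reading of "dynamic". Every point keeps a descriptor, so the number of points is unchanged from input to output.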

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Acknowledgements

This work was supported by the Natural Science Foundation of Fujian Province of China (Grants No. 2021J01540 and No. 2021J05106) and the National Natural Science Foundation of China (Grants No. 62032022 and No. 62006215).

Author information

Authors and Affiliations

School of Electronics and Communication Engineering, Quanzhou University of Information Engineering, Quanzhou, 362000, Fujian, China
Xinhong Meng
College of Sciences, China Jiliang University, Hangzhou, 310018, Zhejiang, China
Xinyu Lu, Hailiang Ye, Bing Yang & Feilong Cao

Corresponding author

Correspondence to Feilong Cao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Meng, X., Lu, X., Ye, H. et al. A new self-augment CNN for 3D point cloud classification and segmentation. Int. J. Mach. Learn. & Cyber. 15, 807–818 (2024). https://doi.org/10.1007/s13042-023-01940-4

Received: 02 May 2023
Accepted: 23 July 2023
Published: 17 August 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s13042-023-01940-4
