
Learning local contextual features for 3D point clouds semantic segmentation by attentive kernel convolution

  • Original article

Abstract

Unlike 2D images, which are represented on regular grids, 3D point clouds are irregular and unordered, so directly applying convolutional neural networks (CNNs) to point clouds is challenging. In this paper, we propose a novel deep neural network named AKNet for point cloud semantic segmentation. The key component of AKNet is the attentive kernel convolution (AKConv), a deformed convolution operation that perceives sufficient local context of 3D scenes. AKConv first constructs Basic Weight Units that are robust to point ordering. Then, to capture more distinctive local features, the convolution kernels of AKConv are associated with Attentive Weight Units obtained by applying a self-attentive function to the Basic Weight Units. Furthermore, 3D point clouds provide rich geometric shape information that helps recognize objects; feeding only raw point features into the convolution could lose this geometric information, so we use augmented features as the input of AKConv. In addition, to preserve semantic information from the encoding layers to the decoding layers, we introduce a backward encoding (BE) mechanism that exploits higher-layer semantic features. We conduct experiments on three large-scale point cloud datasets, and the results demonstrate that AKNet outperforms state-of-the-art (SOTA) networks.
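
Because only the abstract is available on this page, the snippet below is a minimal, hypothetical PyTorch sketch of what an AKConv-style layer could look like. The k-nearest-neighbour grouping, the relative-position feature augmentation, the shared MLP standing in for the Basic Weight Unit, and the softmax scoring standing in for the Attentive Weight Unit are all illustrative assumptions, not the authors' implementation; the backward encoding (BE) mechanism is not covered.

import torch
import torch.nn as nn
import torch.nn.functional as F


def knn_group(xyz, k):
    # Indices of the k nearest neighbours of every point (self included).
    # xyz: (B, N, 3) point coordinates.
    dist = torch.cdist(xyz, xyz)                               # (B, N, N)
    return dist.topk(k, dim=-1, largest=False).indices         # (B, N, k)


class AttentiveKernelConvSketch(nn.Module):
    # Hypothetical AKConv-like layer: permutation-robust "basic" kernel
    # responses are re-weighted by learned attention scores before aggregation.

    def __init__(self, in_dim, out_dim, k=16):
        super().__init__()
        self.k = k
        # Basic Weight Unit (assumed): a shared MLP applied to each augmented
        # neighbour feature, so the result does not depend on point ordering.
        self.basic = nn.Sequential(
            nn.Linear(in_dim + 3, out_dim), nn.ReLU(),
            nn.Linear(out_dim, out_dim),
        )
        # Attentive Weight Unit (assumed): scores each neighbour from the
        # basic response, acting as a self-attentive function over the kernel.
        self.attn = nn.Linear(out_dim, 1)

    def forward(self, xyz, feats):
        # xyz: (B, N, 3) coordinates, feats: (B, N, C) per-point features.
        B = xyz.shape[0]
        idx = knn_group(xyz, self.k)                            # (B, N, k)
        batch = torch.arange(B, device=xyz.device).view(B, 1, 1)
        nbr_xyz = xyz[batch, idx]                               # (B, N, k, 3)
        nbr_feats = feats[batch, idx]                           # (B, N, k, C)
        # Augmented input (assumed): neighbour features plus relative
        # positions, so local geometric shape is not lost.
        rel = nbr_xyz - xyz.unsqueeze(2)                        # (B, N, k, 3)
        basic = self.basic(torch.cat([nbr_feats, rel], dim=-1))  # (B, N, k, D)
        # Softmax attention over the k neighbours emphasises distinctive context.
        scores = F.softmax(self.attn(basic), dim=2)             # (B, N, k, 1)
        return (scores * basic).sum(dim=2)                      # (B, N, D)


if __name__ == "__main__":
    layer = AttentiveKernelConvSketch(in_dim=6, out_dim=32, k=8)
    pts = torch.rand(2, 1024, 3)      # toy batch: 2 clouds, 1024 points each
    fts = torch.rand(2, 1024, 6)      # e.g. colour + normal per point
    print(layer(pts, fts).shape)      # expected: torch.Size([2, 1024, 32])

Running the toy example prints one aggregated local-context feature per point; in a full network, layers of this kind would be stacked in an encoder-decoder, which is where the abstract's BE mechanism would connect encoding and decoding stages.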

Data Availability

The datasets used in this paper are publicly available: Semantic3D [45], S3DIS [46] and SensatUrban [47] can be obtained from their respective providers.

References

  1. Qi, C.R., Chen, X., Litany, O., et al.: ImVoteNet: boosting 3D object detection in point clouds with image votes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4404–4413 (2020)

  2. Tang, Y., Li, L., Wang, C., et al.: Real-time detection of surface deformation and strain in recycled aggregate concrete-filled steel tubular columns via four-ocular vision. Robotics Comput. Integr. Manuf. 59, 36–46 (2019)

  3. Shao, Y., Tong, G., Peng, H.: Mining local geometric structure for large-scale 3D point clouds semantic segmentation. Neurocomputing 500, 191–202 (2022)

  4. Li, H., Sun, Z.: A structural-constraint 3D point clouds segmentation adversarial method. Vis. Comput. 37(2), 325–340 (2021)

  5. Tateno, K., Tombari, F., Navab, N.: Real-time and scalable incremental segmentation on dense SLAM. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4465–4472 (2015)

  6. Koppula, H. S., Anand, A., Joachims, T., Saxena, A.: Semantic labeling of 3D point clouds for indoor scenes. In: Advances in Neural Information Processing Systems, pp. 244–252 (2011)

  7. Li, R., Zhang, Y., Niu, D., et al.: PointVGG: Graph convolutional network with progressive aggregating features on point clouds. Neurocomputing 429, 187–198 (2021)

  8. Wu, J., Jiao, J., Yang, Q., et al.: Ground-aware point cloud semantic segmentation for autonomous driving. In: Proceedings of the 27th ACM International Conference on Multimedia, pp. 971–979 (2019)

  9. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)

  10. Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241 (2015)

  11. Tang, Y., Chen, Z., Huang, Z., et al.: Visual measurement of dam concrete cracks based on U-net and improved thinning algorithm. J. Exp. Mech. 37(2), 209–220 (2022)

  12. Su, H., Maji, S., et al.: Multi-view convolutional neural networks for 3D shape recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 945–953 (2015)

  13. Boulch, A., Le Saux, B., et al.: Unstructured point cloud semantic labeling using deep segmentation networks. In: Workshop on 3D Object Retrieval (2017)

  14. Graham, B., Engelcke, M., et al.: 3D semantic segmentation with submanifold sparse convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9224–9232 (2018)

  15. Tchapmi, L., Choy, C., Armeni, I., et al.: Segcloud: semantic segmentation of 3D point clouds. In: Proceedings of the International Conference on 3D Vision, pp. 537–547 (2017)

  16. Klokov, R., Lempitsky, V.: Deep kd-networks for the recognition of 3D point cloud models. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 863–872 (2017)

  17. Riegler, G., Osman Ulusoy, A., Geiger, A.: OctNet: learning deep 3D representations at high resolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3577–3586 (2017)

  18. Qi, C.R., Su, H., Mo, K., et al.: Pointnet: deep learning on point sets for 3D classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 652–660 (2017)

  19. Qi, C.R., Yi, L., Su, H., Guibas, L.J.: PointNet++: deep hierarchical feature learning on point sets in a metric space. In: Proceedings of the Conference and Workshop on Neural Information Processing Systems, Long Beach, CA, pp. 5099–5108 (2017)

  20. Wang, Y., Sun, Y., Liu, Z., et al.: Dynamic graph CNN for learning on point clouds. ACM Trans. Graph. 38(5), 1–12 (2019)

  21. Zhao, H., Jiang, L., Fu, C.W., et al.: Pointweb: enhancing local neighborhood features for point cloud processing. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5565–5573 (2019)

  22. Lan, S., Yu, R., Yu, G., et al.: Modeling local geometric structure of 3D point clouds using geo-CNN. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 998–1008 (2019)

  23. Jiang, M., Wu, Y., Zhao, T., et al.: PointSIFT: a SIFT-like network module for 3D point cloud semantic segmentation. arXiv preprint arXiv:1807.00652 (2018)

  24. Zhang, Z., Hua, B.S., Yeung, S.K.: ShellNet: efficient point cloud convolutional neural networks using concentric shells statistics. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1607–1616 (2019)

  25. Hu, Q., Yang, B., Xie, L., et al.: RandLA-Net: efficient semantic segmentation of large-scale point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11108–11117 (2020)

  26. Fan, S., Dong, Q., Zhu, F., et al.: SCF-Net: learning spatial contextual features for large-scale point cloud segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14504–14513 (2021)

  27. Simonovsky, M., Komodakis, N.: Dynamic edge-conditioned filters in convolutional neural networks on graphs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, pp. 29–38 (2017)

  28. Li, Y., Bu, R., Sun, M., et al.: PointCNN: convolution on X-transformed points. In: Proceedings of the Conference and Workshop on Neural Information Processing Systems, Montreal, Canada, pp. 820–830 (2018)

  29. Wang, L., Huang, Y., Hou, Y., et al.: Graph attention convolution for point cloud semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10296–10305 (2019)

  30. Liu, Y., Fan, B., Xiang, S., et al.: Relation-shape convolutional neural network for point cloud analysis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8895–8904 (2019)

  31. Wu, W., Qi, Z., Fuxin, L.: Pointconv: deep convolutional networks on 3D point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9621–9630 (2019)

  32. Thomas, H., Qi, C.R., Deschaud, J.E., et al.: Kpconv: flexible and deformable convolution for point clouds. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6411–6420 (2019)

  33. Xu, M., Ding, R., Zhao, H., et al.: Paconv: position adaptive convolution with dynamic kernel assembling on point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3173–3182 (2021)

  34. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)

  35. Shu, X., Yang, J., Yan, R., et al.: Expansion-squeeze-excitation fusion network for elderly activity recognition. IEEE Trans. Circuits Syst. Video Technol. 32(8), 5281–5292 (2022)

  36. Shi, W., Du, H., Mei, W., et al.: (SARN) spatial-wise attention residual network for image super-resolution. Vis. Comput. 37(6), 1569–1580 (2021)

  37. Shu, X., Zhang, L., Qi, G.J., et al.: Spatiotemporal co-attention recurrent neural networks for human-skeleton motion prediction. IEEE Trans. Pattern Anal. Mach. Intell. 44(6), 3300–3315 (2022)

  38. Kumar, N., Sukavanam, N.: Weakly supervised deep network for spatiotemporal localization and detection of human actions in wild conditions. Vis. Comput. 36(9), 1809–1821 (2020)

  39. Xu, B., Shu, X., Song, Y.: X-invariant contrastive augmentation and representation learning for semi-supervised skeleton-based action recognition. IEEE Trans. Image Process. 31(5), 3852–3867 (2022)

  40. Wang, P., Yao, W.: A new weakly supervised approach for ALS point cloud semantic segmentation. ISPRS J. Photogramm. Remote Sens. 188, 237–254 (2022)

  41. Hu, Q., Yang, B., Fang, G., et al.: Sqn: weakly-supervised semantic segmentation of large-scale 3D point clouds with 1000x fewer labels. arXiv preprint arXiv:2104.04891 (2021)

  42. Thomas, H., Goulette, F., Deschaud, J., et al.: Semantic classification of 3D point clouds with multiscale spherical neighborhoods. In: Proceedings of the International Conference on 3D Vision (3DV), pp. 390–398 (2018)

  43. Landrieu, L., Simonovsky, M.: Large-scale point cloud semantic segmentation with superpoint graphs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4558–4567 (2018)

  44. Gong, J., Xu, J., Tan, X., et al.: Omni-supervised point cloud segmentation via gradual receptive field component reasoning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11673–11682 (2021)

  45. Hackel, T., Savinov, N., Ladicky, L., Wegner, J.D., et al.: Semantic3D.net: a new large-scale point cloud classification benchmark. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. IV-1/W1, 91–98 (2017)

  46. Armeni, I., Sener, O., Zamir, A.R., Jiang, H., Brilakis, I., Fischer, M., Savarese, S.: 3D semantic parsing of large-scale indoor spaces. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1534–1543 (2016)

  47. Hu, Q., Yang, B., Khalid, S., et al.: Towards semantic segmentation of urban-scale 3D point clouds: a dataset, benchmarks and challenges. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4977–4987 (2021)

  48. Tatarchenko, M., Park, J., Koltun, V., et al.: Tangent convolutions for dense prediction in 3D. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3887–3896 (2018)

  49. Graham, B., Engelcke, M., Van Der Maaten, L.: 3D semantic segmentation with submanifold sparse convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9224–9232 (2018)

Author information

Corresponding author

Correspondence to Yuyuan Shao.

Ethics declarations

Conflict of interest

No conflict of interest exists in the submission of this manuscript, and the manuscript has been approved by all authors for publication. On behalf of my co-authors, I declare that the work described is original research that has not been published previously.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by the National Key R&D Program of China (Nos. 2019YFB1309905 and 2020YFB1712802).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tong, G., Shao, Y. & Peng, H. Learning local contextual features for 3D point clouds semantic segmentation by attentive kernel convolution. Vis Comput 40, 831–847 (2024). https://doi.org/10.1007/s00371-023-02819-9
