
High-Resolution Self-attention with Fair Loss for Point Cloud Segmentation

  • Conference paper
Neural Information Processing (ICONIP 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14451)


  • The original version of this chapter was revised: In the PDF version of this chapter, both Fig. 4 and Fig. 5 erroneously showed the same figure. A correction to this chapter can be found at https://doi.org/10.1007/978-981-99-8073-4_46

Abstract

Applying deep learning techniques to point cloud analysis has emerged as a prominent research direction. However, insufficient integration of spatial and feature information within point clouds, together with the unbalanced class distributions of real-world datasets, has hindered progress. Given the success of self-attention mechanisms in numerous domains, we apply the High-Resolution Self-Attention (HRSA) module as a plug-and-play solution for point cloud segmentation. The proposed HRSA module preserves high-resolution internal representations in both the spatial and feature dimensions. Additionally, we introduce the Fair Loss, which rebalances the gradients of dominant and weak classes to address the unbalanced class distributions of real-world datasets and improve the network's inference capability. Both modules integrate seamlessly into an MLP-based architecture tailored for large-scale point cloud processing, yielding a new segmentation network called PointHR. PointHR achieves mIoU scores of 69.8% and 74.5% on S3DIS Area-5 and 6-fold cross-validation, respectively. Combined with a significantly smaller number of parameters, these results make PointHR highly competitive in point cloud semantic segmentation.
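The abstract does not reproduce the paper's exact Fair Loss formulation. As a rough, hypothetical sketch of the underlying idea it describes — boosting the gradient contribution of weak (rare) classes relative to dominant ones — a class-imbalance-aware loss can be illustrated with inverse-frequency weighting of a cross-entropy-style objective. The function names and the weighting scheme below are illustrative assumptions, not the paper's method:

```python
import math
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights (illustrative, not the paper's Fair Loss):
    rare classes get larger weights, so their per-point gradient contribution
    is amplified relative to dominant classes."""
    counts = Counter(labels)
    total = len(labels)
    # weight_c is proportional to total / count_c, normalized to mean 1
    raw = {c: total / n for c, n in counts.items()}
    mean = sum(raw.values()) / len(raw)
    return {c: w / mean for c, w in raw.items()}

def weighted_nll(probs, labels, weights):
    """Class-weighted negative log-likelihood over predicted probabilities.

    probs:   list of per-point probability vectors (one float per class)
    labels:  ground-truth class index for each point
    weights: per-class weight dict, e.g. from class_weights()
    """
    return sum(-weights[y] * math.log(p[y])
               for p, y in zip(probs, labels)) / len(labels)
```

With 8 points of class 0 and 2 points of class 1, `class_weights` yields weights 0.4 and 1.6, so misclassifying the rare class costs four times as much as misclassifying the dominant one; this is the general effect (if not the precise mechanism) the Fair Loss is described as achieving.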


Change history

  • 15 December 2023

    A correction has been published.


Acknowledgement

This work was supported in part by the Heilongjiang Provincial Science and Technology Program under Grant 2022ZX01A16, and in part by the Sichuan Science and Technology Program under Grant 2022YFG0148.

Author information


Corresponding author

Correspondence to Qiang Li.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Liu, Q., Lu, J., Li, Q., Huang, B. (2024). High-Resolution Self-attention with Fair Loss for Point Cloud Segmentation. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Lecture Notes in Computer Science, vol 14451. Springer, Singapore. https://doi.org/10.1007/978-981-99-8073-4_27


  • DOI: https://doi.org/10.1007/978-981-99-8073-4_27


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8072-7

  • Online ISBN: 978-981-99-8073-4

  • eBook Packages: Computer Science (R0)
