
Efficient Point Cloud Segmentation with Geometry-Aware Sparse Networks

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13699)

Abstract

In point cloud learning, sparsity and geometry are two core properties. Many recent approaches use single or multiple representations to improve the performance of point cloud semantic segmentation. However, these works fail to balance performance, efficiency, and memory consumption, and do not integrate sparsity and geometry appropriately. To address these issues, we propose Geometry-aware Sparse Networks (GASN), which exploit the sparsity and geometry of a point cloud in a single voxel representation. GASN consists of two main modules: a Sparse Feature Encoder, which extracts local context information, and a Sparse Geometry Feature Enhancement module, which enhances the geometric properties of a sparse point cloud to improve both efficiency and performance. In addition, we propose deep sparse supervision in the training phase to aid convergence and reduce memory consumption. GASN achieves state-of-the-art performance on both the SemanticKITTI and nuScenes datasets while running significantly faster and consuming less memory.
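The abstract builds on a single sparse voxel representation: only occupied voxels are stored, and per-voxel features are pooled from the points that fall inside them. The sketch below illustrates this general idea with NumPy; the function name, the voxel size, and the scatter-mean pooling are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def voxelize(points, voxel_size=0.25):
    """Quantize an (N, 3) point cloud into unique sparse voxel coordinates
    and mean-pool the points falling into each occupied voxel.
    Illustrative sketch only; not the GASN reference code."""
    # Integer voxel index for every point.
    coords = np.floor(points / voxel_size).astype(np.int64)
    # Keep only occupied voxels; `inverse` maps each point to its voxel.
    uniq, inverse = np.unique(coords, axis=0, return_inverse=True)
    # Scatter-mean: sum the points assigned to each voxel, then divide by count.
    feats = np.zeros((len(uniq), points.shape[1]))
    np.add.at(feats, inverse, points)
    counts = np.bincount(inverse, minlength=len(uniq))
    feats /= counts[:, None]
    return uniq, feats

pts = np.random.rand(1000, 3)          # points in the unit cube
voxels, feats = voxelize(pts)          # at most 4**3 = 64 occupied voxels
```

The sparsity pays off because downstream (submanifold) sparse convolutions touch only the `len(voxels)` occupied sites rather than a dense 3D grid.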



Author information

Correspondence to Qifeng Chen.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Ye, M., Wan, R., Xu, S., Cao, T., Chen, Q. (2022). Efficient Point Cloud Segmentation with Geometry-Aware Sparse Networks. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13699. Springer, Cham. https://doi.org/10.1007/978-3-031-19842-7_12

  • DOI: https://doi.org/10.1007/978-3-031-19842-7_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19841-0

  • Online ISBN: 978-3-031-19842-7

  • eBook Packages: Computer Science; Computer Science (R0)
