Flexible asymmetric convolutional attention network for LiDAR semantic

Applied Intelligence

Abstract

LiDAR semantic segmentation is an essential task in understanding 3D semantic information. Currently, the most efficient approach to LiDAR data segmentation is to project the point cloud onto a 2D plane and process the resulting range image with 2D convolution. The results of this approach are encouraging. However, the elevation angle covered by each range-image pixel is larger than the azimuth angle, so the 3D space captured per unit pixel area is vertically elongated. If a square convolution kernel is used, the extracted features will be distorted. To address this limitation, we propose the flexible asymmetric convolutional attention network (FACANet), built from flexible asymmetric convolution and lightweight decoding modules. In the encoder, a meta-kernel accounts for the geometric information in 3D space, which helps encode the input range image features effectively. Moreover, a flexible asymmetric convolutional attention block (FACAB) is proposed to capture elongated features in the range image. To facilitate lightweight decoding, the channel uniform interpolation block (CUIB) uses \(1\times 1\) convolutions to reduce channels and bilinear interpolation to upsample features at each resolution. Furthermore, the continuous multiscale feature fusion block (CMFB) is proposed to fuse features at different resolutions. Finally, a convolutional spatial propagation network (CSPN)-based segmentation head is introduced to improve the accuracy of the segmentation results. Quantitative and qualitative experiments are conducted on the public datasets SemanticKITTI and SemanticPOSS, and our approach achieves better accuracy than the advanced models it is compared with.
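The vertical elongation described above can be made concrete with a small calculation. The following is a minimal sketch, not the method of the paper: it assumes a 64 x 2048 range image and a vertical field of view from +3 to -25 degrees (a common configuration for 64-beam sensors, chosen here only for illustration), and it uses the widely used spherical projection convention. The names `SENSOR`, `project_point`, `pixel_footprint`, and `kernel_extent` are hypothetical, and the 3 x 7 kernel is an arbitrary example rather than the FACAB design.

```python
import math

# Assumed sensor layout (illustrative values, not taken from the paper).
SENSOR = dict(height=64, width=2048, fov_up_deg=3.0, fov_down_deg=-25.0)


def project_point(x, y, z, height, width, fov_up_deg, fov_down_deg):
    """Map one 3D point (range > 0) to a (row, col) pixel of the range image."""
    r = math.sqrt(x * x + y * y + z * z)
    yaw = math.atan2(y, x)        # azimuth angle
    pitch = math.asin(z / r)      # elevation angle
    fov_down = math.radians(fov_down_deg)
    fov = math.radians(fov_up_deg) - fov_down
    col = int(0.5 * (1.0 - yaw / math.pi) * width)
    row = int((1.0 - (pitch - fov_down) / fov) * height)
    # Clamp to valid image coordinates.
    return min(height - 1, max(0, row)), min(width - 1, max(0, col))


def pixel_footprint(rng_m, height, width, fov_up_deg, fov_down_deg):
    """(vertical, horizontal) arc length in metres covered by one pixel
    at distance rng_m, using arc length = range * angular step."""
    d_elev = math.radians(fov_up_deg - fov_down_deg) / height  # rad per row
    d_azim = 2.0 * math.pi / width                             # rad per column
    return rng_m * d_elev, rng_m * d_azim


def kernel_extent(k_h, k_w, rng_m, **sensor):
    """Physical (vertical, horizontal) extent of a k_h x k_w kernel."""
    v, h = pixel_footprint(rng_m, **sensor)
    return k_h * v, k_w * h


v, h = pixel_footprint(10.0, **SENSOR)
ratio = v / h                                   # vertical / horizontal step
square = kernel_extent(3, 3, 10.0, **SENSOR)    # square 3 x 3 kernel
wide = kernel_extent(3, 7, 10.0, **SENSOR)      # hypothetical 3 x 7 kernel

print(f"per-pixel footprint at 10 m: {v:.4f} m x {h:.4f} m (ratio {ratio:.2f})")
print(f"3 x 3 kernel covers {square[0]:.3f} m x {square[1]:.3f} m")
print(f"3 x 7 kernel covers {wide[0]:.3f} m x {wide[1]:.3f} m")
```

Under these assumed parameters each pixel spans roughly 2.5 times more elevation than azimuth, so a square kernel covers a tall, narrow region in 3D, while a kernel that is wider than it is tall covers a region closer to square. This is the geometric motivation for asymmetric kernels; the actual kernel shapes used by FACANet are not specified in the abstract.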

Data Availability

The authors declare that all data and materials support the claims in the manuscript and comply with field standards. The original data used in this research are available.

Funding

This work is supported by the National High Technology Research and Development Program of China under Grant No. 2018YFE0204300, the Fundamental Research Funds for the Central Universities under Grant No. 800015Z1413, and the Top Innovative Talents Cultivation Fund for doctoral students under Grant No. BBJ2023039.

Author information

Contributions

Conceptualization: Jianwang Gan, Guoying Zhang; Methodology: Jianwang Gan, Guoying Zhang; Formal analysis and investigation: Jianwang Gan, Yijing Xiong, Kangkang Kou; Writing (original draft preparation): Jianwang Gan; Writing (review and editing): Guoying Zhang, Yijing Xiong, Kangkang Kou.

Corresponding author

Correspondence to Guoying Zhang.

Ethics declarations

Conflicts of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Ethics approval

Compliance with ethical standards.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Gan, J., Zhang, G., Kou, K. et al. Flexible asymmetric convolutional attention network for LiDAR semantic. Appl Intell 54, 6718–6737 (2024). https://doi.org/10.1007/s10489-024-05525-8

  • DOI: https://doi.org/10.1007/s10489-024-05525-8
