Abstract
LiDAR semantic segmentation is an essential task for understanding 3D scenes. Currently, the most efficient approach to LiDAR data segmentation is to project the point cloud onto a 2D plane and process it with 2D convolutions, and the results of this approach are encouraging. However, the elevation angle spanned by each pixel of the range image is larger than the azimuth angle, so the 3D space captured per unit pixel area is vertically elongated. If a square convolution kernel is used, the extracted features are distorted. To address this limitation, we propose the flexible asymmetric convolutional attention network (FACANet), built from flexible asymmetric convolution and lightweight decoding modules. In the encoder, a meta-kernel accounts for the geometric information in 3D space, which helps encode the input range image features effectively. Moreover, a flexible asymmetric convolutional attention block (FACAB) is proposed to capture elongated features in the range image. To keep decoding lightweight, the channel uniform interpolation block (CUIB) uses \(1\times 1\) convolutions to reduce channels and bilinear interpolation to upsample features at each resolution. Furthermore, the continuous multiscale feature fusion block (CMFB) fuses features at different resolutions. Finally, a convolutional spatial propagation network (CSPN)-based segmentation head is introduced to improve the accuracy of the segmentation results. Quantitative and qualitative experiments on the public datasets SemanticKITTI and SemanticPOSS show that our approach achieves better accuracy than advanced models.
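The projection step and the vertical elongation described in the abstract can be illustrated with a minimal sketch of a standard spherical (range-view) projection. This is not the authors' implementation. The function names, the 64 x 2048 image size, and the vertical field of view of +3 to -25 degrees are assumptions chosen for illustration; they are typical of 64-beam sensors but are not taken from this article.

```python
import math


def pixel_angular_extent(height=64, width=2048, fov_up_deg=3.0, fov_down_deg=-25.0):
    """Return (vertical, horizontal) degrees covered by one range-image pixel.

    All default values are illustrative assumptions, not values from the paper.
    """
    vertical = (fov_up_deg - fov_down_deg) / height
    horizontal = 360.0 / width
    return vertical, horizontal


def project_to_range_image(points, height=64, width=2048,
                           fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project (x, y, z) points onto a height x width range image.

    Empty pixels hold -1.0. When several points fall into the same pixel,
    the nearest one is kept.
    """
    fov_down = math.radians(fov_down_deg)
    fov = math.radians(fov_up_deg) - fov_down
    image = [[-1.0] * width for _ in range(height)]

    def distance(p):
        return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)

    # Write far points first so that nearer points overwrite them.
    for x, y, z in sorted(points, key=distance, reverse=True):
        r = distance((x, y, z))
        if r == 0.0:
            continue  # a point at the sensor origin has no direction
        yaw = math.atan2(y, x)     # azimuth angle
        pitch = math.asin(z / r)   # elevation angle
        u = 0.5 * (1.0 - yaw / math.pi) * width          # column coordinate
        v = (1.0 - (pitch - fov_down) / fov) * height    # row coordinate
        col = min(width - 1, max(0, int(math.floor(u))))
        row = min(height - 1, max(0, int(math.floor(v))))
        image[row][col] = r
    return image


if __name__ == "__main__":
    vertical, horizontal = pixel_angular_extent()
    print(f"degrees per pixel: vertical={vertical:.4f}, horizontal={horizontal:.4f}")
    print(f"vertical / horizontal ratio: {vertical / horizontal:.2f}")
```

Under these assumed parameters one pixel covers a larger elevation angle than azimuth angle, which is the kind of anisotropy that motivates a non-square kernel.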
Data Availability
The authors declare that all data and materials support the claims in the manuscript and comply with field standards. The original data used in this research are available.
Funding
This work is supported by the National High Technology Research and Development Program of China under Grant No. 2018YFE0204300, the Fundamental Research Funds for the Central Universities under Grant No. 800015Z1413, and the Top Innovative Talents Cultivation Fund for doctoral students under Grant No. BBJ2023039.
Author information
Contributions
Conceptualization: Jianwang Gan, Guoying Zhang; Methodology: Jianwang Gan, Guoying Zhang; Formal analysis and investigation: Jianwang Gan, Yijing Xiong, Kangkang Kou; Writing - original draft preparation: Jianwang Gan; Writing - review and editing: Guoying Zhang, Yijing Xiong, Kangkang Kou.
Ethics declarations
Conflicts of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Ethics approval
Compliance with ethical standards.
About this article
Cite this article
Gan, J., Zhang, G., Kou, K. et al. Flexible asymmetric convolutional attention network for LiDAR semantic. Appl Intell 54, 6718–6737 (2024). https://doi.org/10.1007/s10489-024-05525-8
DOI: https://doi.org/10.1007/s10489-024-05525-8