Abstract
Accurate semantic understanding of the surrounding environment is a challenge for autonomous driving systems. Recent LiDAR-based semantic segmentation methods mainly focus on predicting point-wise semantic classes, which cannot be used directly by downstream modules without a further densification step. In this paper, we propose a cylindrical convolution network for dense semantic understanding in the top-view LiDAR data representation. 3D LiDAR point clouds are divided into cylindrical partitions before being fed to the network, and semantic segmentation is conducted in this cylindrical representation. A cylinder-to-BEV transformation module is then introduced to obtain sparse semantic feature maps in the top view. Finally, we propose a modified encoder-decoder network to produce dense semantic estimates. Experimental results on the SemanticKITTI and nuScenes-LidarSeg datasets show that our method outperforms state-of-the-art methods by a large margin.
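To make the cylindrical-partition step concrete, the following is a minimal illustrative sketch of how 3D LiDAR points can be binned into a cylindrical grid. It is not the authors' implementation; the grid resolutions, range limits, and the function name `cylindrical_partition` are assumptions chosen for illustration.

```python
import numpy as np

def cylindrical_partition(points, rho_bins=480, phi_bins=360, z_bins=32,
                          rho_max=50.0, z_min=-3.0, z_max=1.0):
    """Assign each LiDAR point (x, y, z) to a cylindrical voxel index.

    A sketch of cylindrical partitioning: Cartesian coordinates are
    converted to (radius, azimuth, height) and quantized into a
    rho_bins x phi_bins x z_bins grid. All grid parameters are assumed
    values, not the paper's configuration.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rho = np.sqrt(x ** 2 + y ** 2)   # radial distance from the sensor
    phi = np.arctan2(y, x)           # azimuth angle in [-pi, pi]

    # Quantize each coordinate and clamp to the valid index range.
    rho_idx = np.clip((rho / rho_max * rho_bins).astype(int), 0, rho_bins - 1)
    phi_idx = np.clip(((phi + np.pi) / (2 * np.pi) * phi_bins).astype(int),
                      0, phi_bins - 1)
    z_idx = np.clip(((z - z_min) / (z_max - z_min) * z_bins).astype(int),
                    0, z_bins - 1)
    return np.stack([rho_idx, phi_idx, z_idx], axis=1)
```

Unlike a uniform Cartesian grid, this partitioning keeps the number of points per cell more balanced across distance, since LiDAR returns become sparser farther from the sensor.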
Notes
1. BEV (bird's-eye view) is another term for top view.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. 62106106), the Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of the Ministry of Education (Nanjing University of Science and Technology, Grant JYB202106), the National Key Research and Development Program of China (No. 2019YFB2102100), the Science and Technology Development Fund of Macau SAR (File nos. 0015/2019/AKP and AGJ-2021-0046), the Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems (No. 2019B121205007), and the start-up project of the University of Macau (SRG2021-00022-IOTSC and SKL-IOTSC(UM)-2021-2023).
Electronic Supplementary Material
Below is the link to the electronic supplementary material.
Supplementary material 2 (mp4 73825 KB)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Lu, J., Gu, S., Xu, CZ., Kong, H. (2023). A Cylindrical Convolution Network for Dense Top-View Semantic Segmentation with LiDAR Point Clouds. In: Wang, L., Gall, J., Chin, TJ., Sato, I., Chellappa, R. (eds) Computer Vision – ACCV 2022. ACCV 2022. Lecture Notes in Computer Science, vol 13847. Springer, Cham. https://doi.org/10.1007/978-3-031-26293-7_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26292-0
Online ISBN: 978-3-031-26293-7
eBook Packages: Computer Science, Computer Science (R0)