Abstract
Semantic segmentation of LiDAR point clouds is an important task in autonomous driving. However, training deep models with conventional supervised methods requires large datasets that are costly to label. Label-efficient segmentation approaches are therefore critical for scaling models to new operational domains or for improving performance on rare cases. While most prior works focus on indoor scenes, we are among the first to propose a label-efficient semantic segmentation pipeline for outdoor scenes with LiDAR point clouds. Our method co-designs an efficient labeling process with semi/weakly supervised learning and is applicable to nearly any 3D semantic segmentation backbone. Specifically, we leverage geometric patterns in outdoor scenes to perform a heuristic pre-segmentation that reduces manual labeling effort, and we jointly design the learning targets with the labeling process. In the learning step, we leverage prototype learning to obtain more descriptive point embeddings, and we use multi-scan distillation to exploit richer semantics from temporally aggregated point clouds to boost the performance of single-scan models. Evaluated on the SemanticKITTI and nuScenes datasets, our proposed method outperforms existing label-efficient methods. With extremely limited human annotations (e.g., 0.1% point labels), it is even highly competitive with its fully supervised counterpart trained on 100% of the labels.
M. Liu—Work done during internship at Waymo LLC.
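To make the heuristic pre-segmentation idea in the abstract more concrete, the sketch below fits a ground plane with RANSAC and groups the remaining points into spatial clusters that an annotator could label per-cluster rather than per-point. This is a minimal illustrative sketch, not the authors' implementation: the function names (fit_ground_plane, cluster_non_ground), the thresholds, and the use of scipy's k-d tree are all assumptions made for this example.

# Minimal sketch of geometry-based pre-segmentation for an outdoor LiDAR scan.
# Assumed helper names and thresholds; not the authors' exact pipeline.
import numpy as np
from scipy.spatial import cKDTree


def fit_ground_plane(points, n_iters=100, inlier_thresh=0.2, rng=None):
    """RANSAC plane fit; returns a boolean mask of ground inliers."""
    rng = np.random.default_rng() if rng is None else rng
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-6:          # degenerate (collinear) sample, skip
            continue
        normal /= norm
        dist = np.abs((points - sample[0]) @ normal)
        mask = dist < inlier_thresh
        if mask.sum() > best_mask.sum():
            best_mask = mask
    return best_mask


def cluster_non_ground(points, radius=0.5):
    """Radius-based connected components over non-ground points."""
    tree = cKDTree(points)
    labels = -np.ones(len(points), dtype=int)
    current = 0
    for seed in range(len(points)):
        if labels[seed] != -1:
            continue
        frontier = [seed]
        labels[seed] = current
        while frontier:
            idx = frontier.pop()
            for nbr in tree.query_ball_point(points[idx], radius):
                if labels[nbr] == -1:
                    labels[nbr] = current
                    frontier.append(nbr)
        current += 1
    return labels


if __name__ == "__main__":
    # Random points as a stand-in for a real LiDAR scan (x, y, z in meters).
    scan = np.random.rand(2000, 3) * [50.0, 50.0, 3.0]
    ground = fit_ground_plane(scan)
    clusters = cluster_non_ground(scan[~ground])
    print(f"{ground.sum()} ground points, {clusters.max() + 1} object clusters")

In a labeling tool built on such a pre-segmentation, the annotator would assign one class per cluster (or per small group of clusters), which is the source of the label savings the abstract describes.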
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, M., Zhou, Y., Qi, C.R., Gong, B., Su, H., Anguelov, D. (2022). LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13699. Springer, Cham. https://doi.org/10.1007/978-3-031-19842-7_5
DOI: https://doi.org/10.1007/978-3-031-19842-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19841-0
Online ISBN: 978-3-031-19842-7
eBook Packages: Computer Science, Computer Science (R0)