A multi-granularity semisupervised active learning for point cloud semantic segmentation

  • Original Article
  • Neural Computing and Applications

Abstract

Recent successes in point cloud semantic segmentation rely heavily on large amounts of annotated data. Three-dimensional point cloud data are generally sparse and unorganized, and a single point cloud frame usually contains more than 100,000 points, which makes annotation difficult. To reduce the annotation effort, we propose a multi-granularity semisupervised active learning pipeline that selects representative, uncertain and diverse data to annotate. To make better use of the annotation budget, we first use a conventional point cloud registration algorithm to build a matching score function, which is used to select a representative subset. We then change the annotation unit from a whole point cloud scan to segmented regions through two semisupervised methods. Subsequently, in each active selection step, the information of each segmented region is computed from two terms, softmax entropy and point cloud intensity, where the latter encourages region diversity. Finally, to further reduce the annotation effort, semisupervised learning is introduced into the pipeline to automatically select a portion of the unlabeled segmented regions with high confidence and assign pseudolabels to them. Extensive experiments show that our approach greatly outperforms previous active learning methods: on the SemanticKITTI dataset, it reaches 95% of the mean class intersection-over-union of fully supervised learning with only 3% of the data labeled.
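The region-level selection and pseudolabeling steps described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulas, so everything specific here is an assumption made for illustration: the function names (`softmax_entropy`, `region_scores`, `pseudo_label_regions`), the use of the intensity standard deviation as the diversity term, the additive combination with a `weight` parameter, and the confidence `threshold` are all hypothetical and may differ from the authors' method.

```python
import numpy as np


def softmax_entropy(probs, eps=1e-12):
    """Per-point Shannon entropy of softmax outputs: (N, C) -> (N,)."""
    p = np.clip(probs, eps, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def region_scores(probs, intensity, region_ids, weight=1.0):
    """Score each region for active selection.

    Assumed form (not taken from the paper): mean softmax entropy of the
    region plus `weight` times the standard deviation of its intensities.
    """
    scores = {}
    for r in np.unique(region_ids):
        mask = region_ids == r
        uncertainty = softmax_entropy(probs[mask]).mean()
        intensity_term = intensity[mask].std()
        scores[int(r)] = float(uncertainty + weight * intensity_term)
    return scores


def pseudo_label_regions(probs, region_ids, threshold=0.9):
    """Give a pseudolabel to each region whose averaged class distribution
    has a maximum of at least `threshold` (assumed confidence rule)."""
    labels = {}
    for r in np.unique(region_ids):
        mean_probs = probs[region_ids == r].mean(axis=0)
        if mean_probs.max() >= threshold:
            labels[int(r)] = int(mean_probs.argmax())
    return labels


# Toy data: four points, two classes, two regions.
probs = np.array([[0.5, 0.5], [0.5, 0.5], [0.99, 0.01], [0.98, 0.02]])
intensity = np.array([0.1, 0.9, 0.5, 0.5])
region_ids = np.array([0, 0, 1, 1])

scores = region_scores(probs, intensity, region_ids)
to_annotate = max(scores, key=scores.get)  # most informative region
pseudo = pseudo_label_regions(probs, region_ids)
```

In this toy case the uncertain, intensity-varied region 0 is the one sent for annotation, while the confidently predicted region 1 receives a pseudolabel instead of a human label.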

Data availability

The datasets generated and/or analyzed during the current study are available from the corresponding author (Z. Pan) and from S. Ye on reasonable request.

Acknowledgements

This work was supported in part by the National Key Research and Development Program of China under Grant No. 2021YFB2501300 and in part by the National Important Science & Technology Specific Projects under Grant No. 2017ZX01038201.

Author information

Corresponding author

Correspondence to Zhijie Pan.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ye, S., Yin, Z., Fu, Y. et al. A multi-granularity semisupervised active learning for point cloud semantic segmentation. Neural Comput & Applic 35, 15629–15645 (2023). https://doi.org/10.1007/s00521-023-08455-7

  • DOI: https://doi.org/10.1007/s00521-023-08455-7
