Abstract
Recent successes in point cloud semantic segmentation rely heavily on large amounts of annotated data. Furthermore, three-dimensional point cloud data are generally sparse and unorganized, and a single frame usually contains more than 100,000 points, which makes point cloud annotation difficult. To reduce the annotation effort, we propose a multi-granularity semisupervised active learning pipeline that selects representative, uncertain and diverse data to annotate. To make better use of the annotation budget, we first use a conventional point cloud registration algorithm to develop a matching score function, which is used to select a representative subset. We then change the annotation unit from a whole point cloud scan to segmented regions through two semisupervised methods. Subsequently, in each active selection step, the information of each segmented region is computed from two terms, softmax entropy and point cloud intensity, where the latter serves to encourage region diversity. Finally, to further reduce the annotation effort, semisupervised learning is introduced into the pipeline to automatically select a portion of the unlabeled segmented regions with high confidence and assign pseudolabels to them. Extensive experiments show that our approach greatly outperforms previous active learning methods: on the SemanticKITTI dataset, it reaches 95% of the mean class intersection-over-union of fully supervised learning with merely 3% of the data labeled.
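The region-level selection and pseudolabeling steps described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the region score is taken as mean per-point softmax entropy plus a weight `alpha` times the standard deviation of intensity inside the region, and a region receives a pseudolabel by majority vote when its mean top-class probability is at least a threshold `tau`. The function names, `alpha`, `tau`, and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict
from statistics import mean, pstdev


def point_entropy(p):
    """Shannon entropy of one softmax vector (natural log)."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def group_by_region(region_ids):
    """Map each region id to the indices of the points it contains."""
    groups = defaultdict(list)
    for idx, r in enumerate(region_ids):
        groups[r].append(idx)
    return groups


def region_scores(probs, intensity, region_ids, alpha=0.5):
    """Assumed score: mean entropy + alpha * intensity spread per region."""
    scores = {}
    for r, idxs in group_by_region(region_ids).items():
        uncertainty = mean(point_entropy(probs[i]) for i in idxs)
        spread = pstdev([intensity[i] for i in idxs])  # assumed diversity proxy
        scores[r] = uncertainty + alpha * spread
    return scores


def pseudolabel_regions(probs, region_ids, tau=0.9):
    """Assumed rule: label a region by majority vote if mean confidence >= tau."""
    labels = {}
    for r, idxs in group_by_region(region_ids).items():
        confidence = mean(max(probs[i]) for i in idxs)
        if confidence >= tau:
            votes = Counter(probs[i].index(max(probs[i])) for i in idxs)
            labels[r] = votes.most_common(1)[0][0]
    return labels


# Toy data: four points, three classes, two regions (all values hypothetical).
probs = [
    [0.98, 0.01, 0.01],
    [0.96, 0.02, 0.02],
    [0.40, 0.35, 0.25],
    [0.34, 0.33, 0.33],
]
intensity = [0.10, 0.12, 0.05, 0.90]
region_ids = [0, 0, 1, 1]

scores = region_scores(probs, intensity, region_ids)
labels = pseudolabel_regions(probs, region_ids)

# Regions with the highest score would be sent to the annotator first.
to_annotate = sorted(scores, key=scores.get, reverse=True)
```

In this toy example, region 1 is both uncertain and varied in intensity, so it ranks first for annotation, while the confidently predicted region 0 is the only one that receives a pseudolabel. The actual weighting and thresholds used by the authors may differ.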












Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author (Z. Pan) and S. Ye on reasonable request.
Acknowledgements
The work was supported in part by the National Key Research and Development Program of China under Grant No. 2021YFB2501300 and in part by the National Important Science & Technology Specific Projects under Grant No. 2017ZX01038201.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Ye, S., Yin, Z., Fu, Y. et al. A multi-granularity semisupervised active learning for point cloud semantic segmentation. Neural Comput & Applic 35, 15629–15645 (2023). https://doi.org/10.1007/s00521-023-08455-7
DOI: https://doi.org/10.1007/s00521-023-08455-7