Abstract
Purpose
Segmentation tasks are important for computer-assisted surgery systems because they provide the shapes of organs and the locations of instruments. What prevents the most powerful segmentation approaches from becoming practical applications is their need for annotated data. Active learning offers strategies that dynamically select the most informative samples to reduce the annotation workload. However, most previous active learning work fails to select frames containing classes that appear at low frequency, even though such classes are common in laparoscopic videos, which results in poor segmentation performance. Furthermore, few previous works have exploited the unselected data to improve active learning. In this work, we therefore focus on these low-frequency classes to improve segmentation performance.
Methods
We propose a class-wise confidence bank that stores and updates a confidence score for each class, together with a new acquisition function built on this bank. We further apply the confidence scores to explore the unlabeled dataset, combining them with a class-wise data mixing method to exploit unlabeled data without any annotation.
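To make the idea concrete, below is a minimal Python sketch of one plausible realization: a bank that tracks a per-class confidence score with an exponential moving average (EMA), and an acquisition function that favors unlabeled images predicted to contain low-confidence classes. The class name ConfidenceBank, the momentum parameter, and the (1 - confidence) weighting are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

class ConfidenceBank:
    """Running confidence score per class (illustrative sketch).

    Assumption: the bank is updated with an EMA of the mean softmax
    confidence over each class's predicted pixels.
    """

    def __init__(self, num_classes: int, momentum: float = 0.99):
        self.scores = np.zeros(num_classes)  # unseen classes start at zero confidence
        self.momentum = momentum

    def update(self, probs: np.ndarray) -> None:
        """probs: (C, H, W) softmax output of the segmentation model for one image."""
        pred = probs.argmax(axis=0)  # (H, W) hard class labels
        for c in range(len(self.scores)):
            mask = pred == c
            if mask.any():
                mean_conf = probs[c][mask].mean()  # mean confidence of class c's pixels
                self.scores[c] = (self.momentum * self.scores[c]
                                  + (1 - self.momentum) * mean_conf)

def acquisition_score(probs: np.ndarray, bank: ConfidenceBank) -> float:
    """Score an unlabeled image: higher when it is predicted to contain
    classes the bank currently rates as low-confidence (hypothetical weighting)."""
    present = np.unique(probs.argmax(axis=0))
    return float(sum(1.0 - bank.scores[c] for c in present))
```

In each acquisition round, one would run the model over the unlabeled pool, call `bank.update` on every prediction, rank images by `acquisition_score`, and send the top-ranked frames for annotation; the same per-class scores could also steer which classes are pasted during the class-wise mixing step.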
Results
We validated our proposal on two open-source datasets, CholecSeg8k and RobSeg2017, and observed that it surpassed previous active learning methods by about \(10\%\) on CholecSeg8k, especially for classes that appear at low frequency. For RobSeg2017, we conducted experiments with both small and large annotation budgets to validate the situations in which our proposal is effective.
Conclusions
We presented a class-wise confidence score to improve the acquisition function for active learning and used it to explore unlabeled data, yielding a large improvement over the compared methods. The experiments also showed that our proposal improves segmentation performance for classes that appear at low frequency.
Notes
The upper bound of the active learning budget was set based on the performance in the fully supervised learning setting.
Acknowledgements
This work was supported by the JST CREST Grant Number JPMJCR20D5, JSPS KAKENHI Grant Number 17H00867, and JSPS Bilateral Joint Research Project.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Qiu, J., Hayashi, Y., Oda, M. et al. Class-wise confidence-aware active learning for laparoscopic images segmentation. Int J CARS 18, 473–482 (2023). https://doi.org/10.1007/s11548-022-02773-2