On the Relationship Between Calibrated Predictors and Unbiased Volume Estimation

Popordanoska, Teodora; Bertels, Jeroen; Vandermeulen, Dirk; Maes, Frederik; Blaschko, Matthew B.

doi:10.1007/978-3-030-87193-2_64

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12901)

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention

Abstract

Machine learning driven medical image segmentation has become standard in medical image analysis. However, deep learning models are prone to overconfident predictions. This has led to a renewed focus on calibrated predictions in the medical imaging and broader machine learning communities. Calibrated predictions are estimates of the probability of a label that correspond to the true expected value of the label conditioned on the confidence. Such calibrated predictions have utility in a range of medical imaging applications, including surgical planning under uncertainty and active learning systems. At the same time, it is often an accurate volume measurement that is of real importance for many medical applications. This work investigates the relationship between model calibration and volume estimation. We demonstrate both mathematically and empirically that if the predictor is calibrated per image, we can obtain the correct volume by taking an expectation of the probability scores per pixel/voxel of the image. Furthermore, we show that linear combinations of calibrated classifiers preserve volume estimation, but do not preserve calibration. Therefore, we conclude that having a calibrated predictor is a sufficient, but not necessary, condition for obtaining an unbiased estimate of the volume. We validate our theoretical findings empirically on a collection of 18 different (calibrated) training strategies on the tasks of glioma volume estimation on the BraTS 2018 dataset and ischemic stroke lesion volume estimation on the ISLES 2018 dataset.
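
The two claims in the abstract can be illustrated on a toy example. The sketch below is not the authors' code: the 4-pixel image, the function names (volume_estimate, calibration_gaps, is_calibrated), and the equal weights of the combination are all assumptions made for illustration. Calibration is checked empirically by grouping pixels by their score and comparing each score with the mean label in its group, and the combination is assumed to use weights that sum to one.

```python
from collections import defaultdict


def volume_estimate(scores):
    """Estimated volume (in pixels): the sum of per-pixel probability scores."""
    return sum(scores)


def calibration_gaps(scores, labels):
    """For each distinct score s, return |mean label among pixels scored s - s|."""
    groups = defaultdict(list)
    for s, y in zip(scores, labels):
        groups[s].append(y)
    return {s: abs(sum(ys) / len(ys) - s) for s, ys in groups.items()}


def is_calibrated(scores, labels, tol=1e-12):
    """Empirical per-image calibration: every score matches its mean label."""
    return all(gap <= tol for gap in calibration_gaps(scores, labels).values())


# Hypothetical 4-pixel image: two foreground pixels, so the true volume is 2.
labels = [1, 0, 1, 0]

# Two predictors that are both calibrated on this image (illustrative values).
f = [0.5, 0.5, 0.5, 0.5]  # maximally uncertain, yet calibrated
g = [1.0, 0.0, 1.0, 0.0]  # perfect predictor, trivially calibrated

# Equal-weight combination of the two (weights assumed to sum to one).
h = [0.5 * a + 0.5 * b for a, b in zip(f, g)]

true_volume = sum(labels)
print(volume_estimate(f), volume_estimate(g), volume_estimate(h), true_volume)
print(is_calibrated(f, labels), is_calibrated(g, labels), is_calibrated(h, labels))
```

In this toy case all three predictors recover the true volume of 2 pixels, while the combination h assigns a score of 0.75 to pixels that are always foreground and 0.25 to pixels that are always background, so it is no longer calibrated. This matches the abstract's conclusion that calibration is sufficient but not necessary for an unbiased volume estimate.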

Acknowledgments

This research received funding from the Flemish Government under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” programme. J.B. is part of NEXIS (www.nexis-project.eu), a project that has received funding from the European Union’s Horizon 2020 Research and Innovations Programme (Grant Agreement #780026).

Author information

Authors and Affiliations

Center for Processing Speech and Images, Department of ESAT, KU Leuven, Leuven, Belgium
Teodora Popordanoska, Jeroen Bertels, Dirk Vandermeulen, Frederik Maes & Matthew B. Blaschko

Corresponding author

Correspondence to Teodora Popordanoska.

Editor information

Editors and Affiliations

Erasmus MC - University Medical Center Rotterdam, Rotterdam, The Netherlands
Marleen de Bruijne
University of Basel, Allschwil, Switzerland
Philippe C. Cattin
Inria Nancy Grand Est, Villers-lès-Nancy, France
Stéphane Cotin
ICube, Université de Strasbourg, CNRS, Strasbourg, France
Nicolas Padoy
National Center for Tumor Diseases (NCT/UCC), Dresden, Germany
Stefanie Speidel
Tencent Jarvis Lab, Shenzhen, China
Yefeng Zheng
ICube, Université de Strasbourg, CNRS, Strasbourg, France
Caroline Essert

Cite this paper

Popordanoska, T., Bertels, J., Vandermeulen, D., Maes, F., Blaschko, M.B. (2021). On the Relationship Between Calibrated Predictors and Unbiased Volume Estimation. In: de Bruijne, M., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2021. MICCAI 2021. Lecture Notes in Computer Science, vol 12901. Springer, Cham. https://doi.org/10.1007/978-3-030-87193-2_64

DOI: https://doi.org/10.1007/978-3-030-87193-2_64
Published: 21 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87192-5
Online ISBN: 978-3-030-87193-2
eBook Packages: Computer Science, Computer Science (R0)
