Abstract
Machine learning-driven medical image segmentation has become standard in medical image analysis. However, deep learning models are prone to overconfident predictions. This has led to a renewed focus on calibrated predictions in the medical imaging and broader machine learning communities. Calibrated predictions are estimates of the probability of a label that correspond to the true expected value of the label conditioned on the confidence. Such calibrated predictions have utility in a range of medical imaging applications, including surgical planning under uncertainty and active learning systems. At the same time, it is often an accurate volume measurement that is of real importance for many medical applications. This work investigates the relationship between model calibration and volume estimation. We demonstrate both mathematically and empirically that if the predictor is calibrated per image, we can obtain the correct volume by taking an expectation of the probability scores per pixel/voxel of the image. Furthermore, we show that linear combinations of calibrated classifiers preserve volume estimation, but do not preserve calibration. Therefore, we conclude that having a calibrated predictor is a sufficient, but not necessary, condition for obtaining an unbiased estimate of the volume. We validate our theoretical findings empirically on a collection of 18 different (calibrated) training strategies on the tasks of glioma volume estimation on the BraTS 2018 dataset and ischemic stroke lesion volume estimation on the ISLES 2018 dataset.
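The central claim can be illustrated with a small synthetic sketch. It is not the authors' experimental code: the probability map `p`, the image size, the 0.5 threshold, and the helper names `soft_volume` and `hard_volume` are all assumptions made for illustration. The predictor is assumed to output exactly the per-voxel probability from which the labels are drawn, which makes it calibrated by construction. The second part averages that predictor with a constant predictor equal to the mean foreground rate, one simple instance of a combination of calibrated predictors.

```python
import numpy as np


def soft_volume(probs, voxel_volume=1.0):
    """Volume estimate: sum of per-voxel foreground probabilities times voxel size."""
    return float(np.sum(probs) * voxel_volume)


def hard_volume(probs, threshold=0.5, voxel_volume=1.0):
    """Volume estimate after thresholding the probability map into a binary mask."""
    return float(np.sum(probs >= threshold) * voxel_volume)


rng = np.random.default_rng(0)

# Hypothetical per-voxel foreground probabilities for one 32x32 image.
# Every voxel is uncertain and below 0.5 (an assumed, deliberately hard case).
p = rng.uniform(0.0, 0.4, size=(32, 32))

# Draw many plausible ground-truth masks from those probabilities and
# average their volumes to approximate the expected true volume.
masks = rng.random((500, 32, 32)) < p
mean_true_volume = float(masks.sum(axis=(1, 2)).mean())

soft = soft_volume(p)   # close to the expected true volume
hard = hard_volume(p)   # 0.0 here, since no voxel reaches the threshold

# A constant predictor equal to the mean foreground rate is also calibrated
# over the voxels of this image, but carries no spatial information.
constant = np.full_like(p, p.mean())

# The equal-weight average keeps the summed probability (the volume) unchanged,
# yet its scores no longer match the per-voxel label probabilities in p.
blended = 0.5 * p + 0.5 * constant
blended_volume = soft_volume(blended)

print(f"expected true volume (simulated): {mean_true_volume:.1f}")
print(f"soft volume from probabilities:   {soft:.1f}")
print(f"hard volume after thresholding:   {hard:.1f}")
print(f"soft volume of blended predictor: {blended_volume:.1f}")
```

In this toy setting the thresholded mask reports zero volume, while the summed probabilities track the simulated expected volume. The blended predictor reports the same volume as `p` even though its scores differ from the true per-voxel probabilities, which is consistent with calibration being sufficient but not necessary for an unbiased volume estimate.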
Acknowledgments
This research received funding from the Flemish Government under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” programme. J.B. is part of NEXIS (www.nexis-project.eu), a project that has received funding from the European Union’s Horizon 2020 Research and Innovations Programme (Grant Agreement #780026).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Popordanoska, T., Bertels, J., Vandermeulen, D., Maes, F., Blaschko, M.B. (2021). On the Relationship Between Calibrated Predictors and Unbiased Volume Estimation. In: de Bruijne, M., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2021. Lecture Notes in Computer Science, vol 12901. Springer, Cham. https://doi.org/10.1007/978-3-030-87193-2_64
DOI: https://doi.org/10.1007/978-3-030-87193-2_64
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87192-5
Online ISBN: 978-3-030-87193-2
eBook Packages: Computer Science, Computer Science (R0)