Test-Time Adaptation with Calibration of Medical Image Classification Nets for Label Distribution Shift

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2022 (MICCAI 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13433)

Abstract

Class distribution plays an important role in learning deep classifiers. When the proportion of each class in the test set differs from that in the training set, the performance of classification nets usually degrades. Such label distribution shift is common in medical diagnosis, since the prevalence of disease varies over location and time. In this paper, we propose the first method to tackle label shift for medical image classification, which effectively adapts a model learned from a single training label distribution to an arbitrary unknown test label distribution. Our approach innovates distribution calibration to learn multiple representative classifiers, each capable of handling a different one-dominating-class distribution. Given a test image, the diverse classifiers are dynamically aggregated via consistency-driven test-time adaptation to deal with the unknown test label distribution. We validate our method on two important medical image classification tasks: liver fibrosis staging and COVID-19 severity prediction. Our experiments clearly show that model performance decreases under label shift. With our method, model performance significantly improves on all the test datasets with different label shifts for both medical image diagnosis tasks. Code is available at https://github.com/med-air/TTADC.
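
To make the idea of one-dominating-class distributions more concrete, the following is a minimal NumPy sketch and not the authors' implementation (see the repository linked above for that). It assumes a simplified stand-in: instead of training several calibrated classifiers, it re-weights one classifier's softmax output toward each assumed prior with the classic prior-shift correction p_t(y|x) ∝ p_s(y|x) · p_t(y) / p_s(y), then mixes the results with fixed weights. The helper names, the `dominant_mass` value, and the uniform mixing weights are all hypothetical; the abstract describes learning the aggregation at test time via consistency, which is not reproduced here.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def one_dominating_priors(num_classes, dominant_mass=0.7):
    # Hypothetical helper: row k is a label prior in which class k
    # holds `dominant_mass` and the rest is shared equally.
    rest = (1.0 - dominant_mass) / (num_classes - 1)
    priors = np.full((num_classes, num_classes), rest)
    np.fill_diagonal(priors, dominant_mass)
    return priors

def prior_shift_correct(probs, train_prior, target_prior):
    # Classic prior-shift correction: re-weight posteriors by the ratio
    # of target to training priors, then renormalize each row.
    w = probs * (target_prior / train_prior)
    return w / w.sum(axis=-1, keepdims=True)

def aggregate(probs, train_prior, priors, weights):
    # One corrected prediction per assumed prior, shape (K, N, C),
    # mixed with the given weights into shape (N, C).
    experts = np.stack(
        [prior_shift_correct(probs, train_prior, p) for p in priors]
    )
    return np.tensordot(weights, experts, axes=1)

# Toy example with 3 classes; all numbers are made up for illustration.
logits = np.array([[2.0, 1.0, 0.1]])
train_prior = np.array([0.6, 0.3, 0.1])
priors = one_dominating_priors(3)
weights = np.full(3, 1.0 / 3.0)  # assumption: uniform, not adapted
mixed = aggregate(softmax(logits), train_prior, priors, weights)
```

In this sketch the mixing weights are the only quantity that a test-time procedure would need to adjust, which is why they are passed in explicitly rather than fixed inside `aggregate`.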

Acknowledgement

This work was supported in part by the Hong Kong Innovation and Technology Fund (Project No. ITS/238/21), in part by the CUHK Shun Hing Institute of Advanced Engineering (project MMT-p5-20), in part by the Shenzhen-HK Collaborative Development Zone, in part by Jilin Provincial Key Laboratory of Medical Imaging & Big Data (20200601003JC), Radiology, and in part by Technology Innovation Center of Jilin Province (20190902016TC).

Author information

Corresponding authors

Correspondence to Huimao Zhang or Qi Dou.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ma, W., Chen, C., Zheng, S., Qin, J., Zhang, H., Dou, Q. (2022). Test-Time Adaptation with Calibration of Medical Image Classification Nets for Label Distribution Shift. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13433. Springer, Cham. https://doi.org/10.1007/978-3-031-16437-8_30

  • DOI: https://doi.org/10.1007/978-3-031-16437-8_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16436-1

  • Online ISBN: 978-3-031-16437-8

  • eBook Packages: Computer Science, Computer Science (R0)
