Learning Robust Classifier for Imbalanced Medical Image Dataset with Noisy Labels by Minimizing Invariant Risk

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 (MICCAI 2023)

Abstract

In medical image analysis, classifying imbalanced, noisy datasets is a long-standing and critical problem, since large-scale clinical datasets often acquire noisy labels and imbalanced distributions through annotation and collection. Current approaches address noisy labels and long-tailed distributions separately, which may negatively impact real-world practice. Additionally, the way class hardness hinders label noise removal remains unexplored, so an approach is needed that improves classification on noisy, imbalanced medical datasets with varying class hardness. To address this problem, we propose a robust classifier trained with a multi-stage noise removal framework, which jointly rectifies the adverse effects of label noise, imbalanced distribution, and class hardness. The proposed noise removal framework consists of multiple phases. The Multi-Environment Risk Minimization (MER) strategy captures data-to-label causal features for noise identification, and the Rescaling Class-aware Gaussian Mixture Modeling (RCGM) learns class-invariant detection mappings for noise removal. Extensive experiments on two imbalanced noisy clinical datasets demonstrate the capability and potential of our method for boosting the performance of medical image classification.
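
To make the idea of class-aware mixture modeling for noise detection more concrete, the sketch below fits a separate two-component Gaussian mixture to the per-sample losses of each class and treats the lower-loss component as "clean". This is not the authors' RCGM, whose details are not given in the abstract. It is a generic illustration that assumes per-sample training losses are already available, that min-max rescaling within each class is a reasonable stand-in for "rescaling", and that the names `fit_two_gaussians` and `clean_probabilities` are hypothetical helpers introduced here.

```python
import math


def _density(v, mean, std, weight):
    # Weighted Gaussian density; the constant 1/sqrt(2*pi) cancels in posteriors.
    return weight * math.exp(-0.5 * ((v - mean) / std) ** 2) / std


def fit_two_gaussians(values, iters=50):
    """Fit a 2-component 1-D Gaussian mixture with EM (illustrative only)."""
    lo, hi = min(values), max(values)
    means = [lo, hi]                          # start the components at the extremes
    stds = [max((hi - lo) / 4, 1e-3)] * 2
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each value.
        resp = []
        for v in values:
            p = [_density(v, means[k], stds[k], weights[k]) for k in range(2)]
            total = sum(p) or 1e-12
            resp.append([pk / total for pk in p])
        # M-step: re-estimate mean, standard deviation, and mixing weight.
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            stds[k] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsed component
            weights[k] = nk / len(values)
    return means, stds, weights


def clean_probabilities(losses, labels):
    """Per class: rescale losses to [0, 1], fit a mixture, return P(clean) per sample."""
    probs = [0.0] * len(losses)
    for c in set(labels):
        idx = [i for i, y in enumerate(labels) if y == c]
        vals = [losses[i] for i in idx]
        lo, hi = min(vals), max(vals)
        span = (hi - lo) or 1.0               # guard against a constant-loss class
        scaled = [(v - lo) / span for v in vals]
        means, stds, weights = fit_two_gaussians(scaled)
        k_clean = 0 if means[0] <= means[1] else 1   # assumption: low loss = clean
        for i, v in zip(idx, scaled):
            p = [_density(v, means[k], stds[k], weights[k]) for k in range(2)]
            total = sum(p) or 1e-12
            probs[i] = p[k_clean] / total
    return probs
```

Calling `clean_probabilities(losses, labels)` returns one probability per sample, and samples below a chosen threshold (for example 0.5) could be flagged as likely mislabeled. Fitting per class rather than globally is the design choice worth noting: a harder or rarer class can have uniformly higher losses on correctly labeled samples, which a single global mixture would tend to mistake for noise.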

Acknowledgement

The work described in this paper was supported in part by the Shenzhen Portion of Shenzhen-Hong Kong Science and Technology Innovation Cooperation Zone under HZQB-KCZYB-20200089. The work was also partially supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project Number: T45-401/22-N) and by a grant from the Hong Kong Innovation and Technology Fund (Project Number: GHP/080/20SZ). The work was also partially supported by a grant from the National Key R&D Program of China (2022YFE0200700), a grant from the National Natural Science Foundation of China (Project No. 62006219), and a grant from the Natural Science Foundation of Guangdong Province (2022A1515011579).

Corresponding author

Correspondence to Guangyong Chen.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 17127 KB)

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Li, J. et al. (2023). Learning Robust Classifier for Imbalanced Medical Image Dataset with Noisy Labels by Minimizing Invariant Risk. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14225. Springer, Cham. https://doi.org/10.1007/978-3-031-43987-2_30

  • DOI: https://doi.org/10.1007/978-3-031-43987-2_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43986-5

  • Online ISBN: 978-3-031-43987-2

  • eBook Packages: Computer Science (R0)
