Learning Robust Classifier for Imbalanced Medical Image Dataset with Noisy Labels by Minimizing Invariant Risk

Li, Jinpeng; Cao, Hanqun; Wang, Jiaze; Liu, Furui; Dou, Qi; Chen, Guangyong; Heng, Pheng-Ann

doi:10.1007/978-3-031-43987-2_30

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14225)

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention

Abstract

In medical image analysis, classification on imbalanced, noisily labeled datasets is a long-standing and critical problem, since large-scale clinical datasets often acquire noisy labels and imbalanced class distributions during annotation and collection. Current approaches address noisy labels and long-tailed distributions separately, which may degrade performance in real-world practice. Additionally, class hardness as a factor that hinders label noise removal remains unexplored, so an approach is needed that improves classification on noisy, imbalanced medical datasets whose classes vary in hardness. To address this problem, we propose a robust classifier trained with a multi-stage noise removal framework that jointly rectifies the adverse effects of label noise, imbalanced distribution, and class hardness. The framework consists of multiple phases: the Multi-Environment Risk Minimization (MER) strategy captures data-to-label causal features for noise identification, and the Rescaling Class-aware Gaussian Mixture Modeling (RCGM) learns class-invariant detection mappings for noise removal. Extensive experiments on two imbalanced noisy clinical datasets demonstrate the capability and potential of our method for boosting the performance of medical image classification.
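
As an illustration of why noise detection might be made class-aware, the following is a minimal sketch, not the RCGM procedure itself, whose details are not given in the abstract. It assumes per-sample training losses are already available, rescales them within each class (min-max scaling is an assumption made here), fits a two-component one-dimensional Gaussian mixture per class with EM, and flags samples that most likely belong to the high-loss component. All function names, the synthetic losses, and the 0.5 threshold are hypothetical.

```python
import math


def _density(v, mean, var, weight):
    """Weighted Gaussian density of v under one mixture component."""
    return weight * math.exp(-((v - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_gaussians(values, iters=50):
    """Fit a 1-D two-component Gaussian mixture with plain EM."""
    lo, hi = min(values), max(values)
    means = [lo, hi]  # start one component at each extreme
    init_var = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [init_var, init_var]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            dens = [_density(v, means[k], variances[k], weights[k]) for k in range(2)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate mean, variance (floored) and weight.
        for k in range(2):
            nk = sum(r[k] for r in resp) + 1e-12
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            sq = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values))
            variances[k] = max(sq / nk, 1e-6)
            weights[k] = nk / len(values)
    return means, variances, weights


def high_loss_posterior(v, means, variances, weights):
    """Posterior probability that v belongs to the higher-mean component."""
    dens = [_density(v, means[k], variances[k], weights[k]) for k in range(2)]
    total = sum(dens) or 1e-300
    high = 0 if means[0] > means[1] else 1
    return dens[high] / total


def flag_noisy_per_class(losses, labels, threshold=0.5):
    """Flag likely-noisy samples with a separate mixture for each class."""
    flags = [False] * len(losses)
    for c in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == c]
        vals = [losses[i] for i in idx]
        lo, hi = min(vals), max(vals)
        scale = (hi - lo) or 1.0
        # Assumption: min-max rescaling so easy and hard classes share a range.
        scaled = [(v - lo) / scale for v in vals]
        params = fit_two_gaussians(scaled)
        for i, s in zip(idx, scaled):
            flags[i] = high_loss_posterior(s, *params) > threshold
    return flags


def demo_data():
    """Synthetic losses: class 0 is large and easy, class 1 is small and hard."""
    easy_clean = [0.10 + 0.002 * i for i in range(40)]
    easy_noisy = [2.00 + 0.01 * i for i in range(5)]
    hard_clean = [0.80 + 0.02 * i for i in range(10)]
    hard_noisy = [3.0, 3.1, 3.2]
    losses = easy_clean + easy_noisy + hard_clean + hard_noisy
    labels = [0] * 45 + [1] * 13
    noisy_idx = list(range(40, 45)) + list(range(55, 58))
    return losses, labels, noisy_idx
```

Calling `flag_noisy_per_class(*demo_data()[:2])` returns one boolean per sample. Fitting a separate mixture per class means the clean samples of the hard class, whose losses are higher overall, are compared only against other samples of that class rather than against the low losses of the easy majority class.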

Acknowledgement

The work described in this paper was supported in part by the Shenzhen Portion of Shenzhen-Hong Kong Science and Technology Innovation Cooperation Zone under HZQB-KCZYB-20200089. The work was also partially supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project Number: T45-401/22-N) and by a grant from the Hong Kong Innovation and Technology Fund (Project Number: GHP/080/20SZ). The work was also partially supported by a grant from the National Key R&D Program of China (2022YFE0200700), a grant from the National Natural Science Foundation of China (Project No. 62006219), and a grant from the Natural Science Foundation of Guangdong Province (2022A1515011579).

Author information

Authors and Affiliations

Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
Jinpeng Li, Hanqun Cao, Jiaze Wang, Qi Dou & Pheng-Ann Heng
Institute of Medical Intelligence and XR, The Chinese University of Hong Kong, Shatin, Hong Kong
Qi Dou & Pheng-Ann Heng
Zhejiang Lab, Hangzhou, China
Furui Liu & Guangyong Chen

Corresponding author

Correspondence to Guangyong Chen.

Editor information

Editors and Affiliations

Icahn School of Medicine, Mount Sinai, NYC, NY, USA; Tel Aviv University, Tel Aviv, Israel
Hayit Greenspan
Emory University, Atlanta, GA, USA
Anant Madabhushi
Queen’s University, Kingston, ON, Canada
Parvin Mousavi
The University of British Columbia, Vancouver, BC, Canada
Septimiu Salcudean
Yale University, New Haven, CT, USA
James Duncan
IBM Research, San Jose, CA, USA
Tanveer Syeda-Mahmood
Johns Hopkins University, Baltimore, MD, USA
Russell Taylor

About this paper

Cite this paper

Li, J. et al. (2023). Learning Robust Classifier for Imbalanced Medical Image Dataset with Noisy Labels by Minimizing Invariant Risk. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14225. Springer, Cham. https://doi.org/10.1007/978-3-031-43987-2_30

DOI: https://doi.org/10.1007/978-3-031-43987-2_30
Published: 01 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43986-5
Online ISBN: 978-3-031-43987-2
eBook Packages: Computer Science, Computer Science (R0)
