Abstract
Purpose
Machine learning (ML) algorithms are well known to vary in prediction accuracy when trained on imbalanced training sets, which are typical in medical imaging (MI) because pathological and normal cases rarely occur in equal proportions. This paper presents a thorough investigation of the effects of class imbalance, and of methods for mitigating it, in ML algorithms applied to MI.
Methods
We first selected five classes from the Image Retrieval in Medical Applications (IRMA) dataset and performed multiclass classification using a random forest model (RFM); we then performed binary classification using a convolutional neural network (CNN) on a chest X-ray dataset. An imbalanced class was created in the training set by varying the number of images in that class. Methods tested to mitigate class imbalance included oversampling, undersampling, and changing the class weights of the RFM. Model performance was assessed by overall classification accuracy, overall F1 score, and the specificity, recall, and precision of the imbalanced class.
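The three mitigation strategies named above can be sketched in a few lines. This is a minimal illustration, not the authors' code (which is in the Supplementary Materials): it assumes plain random oversampling and undersampling, and the function names, the toy label list, and the weighting formula n_samples / (n_classes * class_count) are assumptions chosen for the example. That formula is the same heuristic scikit-learn documents for `class_weight="balanced"`; the abstract does not state which weights were actually used.

```python
import random
from collections import Counter, defaultdict


def group_by_class(labels):
    """Map each class label to the list of sample indices carrying it."""
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)
    return groups


def random_oversample(labels, seed=0):
    """Return sample indices with every class grown to the largest class size.

    Minority classes are padded by drawing their own indices with replacement.
    """
    rng = random.Random(seed)
    groups = group_by_class(labels)
    target = max(len(idxs) for idxs in groups.values())
    selected = []
    for idxs in groups.values():
        selected.extend(idxs)
        selected.extend(rng.choices(idxs, k=target - len(idxs)))
    return sorted(selected)


def random_undersample(labels, seed=0):
    """Return sample indices with every class shrunk to the smallest class size.

    Majority classes are cut down by drawing without replacement.
    """
    rng = random.Random(seed)
    groups = group_by_class(labels)
    target = min(len(idxs) for idxs in groups.values())
    selected = []
    for idxs in groups.values():
        selected.extend(rng.sample(idxs, target))
    return sorted(selected)


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency (assumed heuristic)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


# Hypothetical toy training labels: 8 normal images, 2 pneumonia images.
labels = ["normal"] * 8 + ["pneumonia"] * 2
over = random_oversample(labels)
under = random_undersample(labels)
weights = balanced_class_weights(labels)
```

The resampling functions return indices rather than images, so the same selection can be applied to an image array and its label array together. Resampling should be applied to the training set only, leaving the test set at its original class ratio.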
Results
A close-to-balanced training set resulted in the best model performance, and a large imbalance caused by overrepresentation of a class was more detrimental to model performance than one caused by underrepresentation. Oversampling and undersampling were both effective in mitigating class imbalance, and the efficacy of oversampling techniques was class specific.
Conclusion
This study systematically demonstrates the effect of class imbalance on an RFM and a CNN using two public X-ray datasets, making these findings widely applicable as a reference. Furthermore, the methods employed here can guide researchers in assessing and addressing the effects of class imbalance while taking data-specific characteristics into account to optimize imbalance-mitigating methods.
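For readers who want to assess an imbalanced class with the metrics listed in the Methods, the sketch below computes specificity, recall, precision, and F1 for one class in a one-vs-rest fashion, plus overall accuracy. The function names and the toy labels are hypothetical, and the sketch does not reproduce the paper's overall F1 score, because the abstract does not say how F1 was averaged across classes.

```python
def one_vs_rest_metrics(y_true, y_pred, target):
    """Treat `target` as the positive class and every other class as negative."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == target and truth == target:
            tp += 1
        elif pred == target:
            fp += 1
        elif truth == target:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Report 0.0 rather than raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    f1 = ratio(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical toy predictions: class 1 is the imbalanced class of interest.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = one_vs_rest_metrics(y_true, y_pred, target=1)
overall_accuracy = accuracy(y_true, y_pred)
```

Reporting specificity and recall alongside accuracy matters under imbalance, because a model that always predicts the majority class can reach high accuracy while having zero recall on the minority class.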



Availability of data and material
The datasets used in this study are an X-ray image dataset (IRMA), a handwritten digit image dataset (MNIST), and a binary chest X-ray dataset consisting of pneumonia and normal images. All are publicly available.
Funding
No funding was received for this work.
Ethics declarations
Conflict of interest
All authors have declared that they have no conflict of interest.
Code availability
All code was custom written in Python 3.7.3 and is included in the Supplementary Materials (Online Resources 2–9 and 11).
About this article
Cite this article
Qu, W., Balki, I., Mendez, M. et al. Assessing and mitigating the effects of class imbalance in machine learning with application to X-ray imaging. Int J CARS 15, 2041–2048 (2020). https://doi.org/10.1007/s11548-020-02260-6
DOI: https://doi.org/10.1007/s11548-020-02260-6