Case-based repeatability of machine learning classification performance on breast MRI

Michael Vieceli; Amy Van Dusen; Karen Drukker; Hiroyuki Abe; Maryellen L. Giger; Heather M. Whitney

doi:10.1117/12.2548144

16 March 2020

Proceedings Volume 11314, Medical Imaging 2020: Computer-Aided Diagnosis; 1131421 (2020) https://doi.org/10.1117/12.2548144
Event: SPIE Medical Imaging, 2020, Houston, Texas, United States

Abstract

Computer-aided diagnosis and radiomics have shown potential in diagnosis and prognosis of breast cancer. The purpose of this study was to investigate repeatability of classifier output and its relationship to classification performance of breast lesions imaged with dynamic contrast-enhanced MRI. Images of 1,169 breast lesions (267 benign, 902 cancers) were retrospectively collected under HIPAA/IRB compliance. The lesions were segmented automatically using a fuzzy c-means method and thirty-eight radiomic features were extracted. Three classification tasks were investigated, with different proportions of cases in each class: (i) benign (23%) vs. malignant (77%), (ii) “pure” ductal carcinoma in situ (DCIS) (25%) vs. DCIS with invasive ductal carcinoma (IDC) (75%), and (iii) invasive cancers of molecular subtype luminal A or luminal B (66%) vs. other molecular subtypes (34%). For each task, support vector machine classifiers were trained and tested within 0.632+ bootstrap analyses (1000 iterations) and the 0.632+ bias-corrected area under the ROC curve (AUC) served as the classification performance metric. Repeatability of classifier output was evaluated at three levels: a) repeatability by case (performance metric: width of the 95% confidence interval of classifier-estimated posterior probabilities for each case), b) repeatability within the dataset (performance metric: median and 95% confidence interval of the by-case 95% confidence interval widths), and c) potential relationship between classification performance and repeatability. In classification performance assessment, median AUCs [95% confidence interval] for the three tasks were 0.85 [0.83, 0.87], 0.84 [0.80, 0.87], and 0.65 [0.60, 0.69], respectively. In repeatability assessment within the dataset, the median confidence interval widths [95% confidence interval] for the posterior probabilities were 0.25 [0.08, 0.72], 0.34 [0.14, 0.84], and 0.23 [0.14, 0.68]. 
In conclusion, the classifiers for the first two tasks demonstrated strong classification performance, while the classifiers for all three tasks showed similar repeatability of posterior probabilities.
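The by-case and within-dataset repeatability metrics described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each case's classifier output is recorded in every bootstrap iteration in which that case is left out of the training sample, and it uses a toy logistic scoring function in place of the support vector machine. The names `percentile`, `ci_width`, `collect_out_of_bag_outputs`, `repeatability_summary`, and `toy_posterior` are hypothetical.

```python
import math
import random
from statistics import median


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def ci_width(values, level=95.0):
    """Width of the central `level` percent interval of the values."""
    tail = (100.0 - level) / 2.0
    return percentile(values, 100.0 - tail) - percentile(values, tail)


def collect_out_of_bag_outputs(n_cases, n_iterations, score_fn, rng):
    """Record score_fn(train, case) for every case left out of each bootstrap sample.

    Assumption: outputs are gathered only when a case is not in the training sample.
    """
    outputs = {case: [] for case in range(n_cases)}
    for _ in range(n_iterations):
        # Draw a bootstrap training sample with replacement.
        train = [rng.randrange(n_cases) for _ in range(n_cases)]
        in_train = set(train)
        for case in range(n_cases):
            if case not in in_train:
                outputs[case].append(score_fn(train, case))
    return outputs


def repeatability_summary(outputs_by_case):
    """Return per-case 95% CI widths and (median, 2.5th, 97.5th percentile) of the widths."""
    # Cases that were never left out have no outputs and are skipped.
    widths = {case: ci_width(p) for case, p in outputs_by_case.items() if p}
    w = list(widths.values())
    return widths, (median(w), percentile(w, 2.5), percentile(w, 97.5))


if __name__ == "__main__":
    rng = random.Random(0)
    n = 60
    feature = [i / (n - 1) for i in range(n)]  # hypothetical single feature per case

    def toy_posterior(train, case):
        # Stand-in for a trained classifier: a logistic curve whose
        # threshold depends on the bootstrap training sample.
        threshold = sum(feature[i] for i in train) / len(train)
        return 1.0 / (1.0 + math.exp(-10.0 * (feature[case] - threshold)))

    outputs = collect_out_of_bag_outputs(n, 200, toy_posterior, rng)
    widths, (med, low, high) = repeatability_summary(outputs)
    print(f"median CI width {med:.3f} [{low:.3f}, {high:.3f}]")
```

In this sketch a narrow per-case width means the classifier assigns nearly the same posterior probability to that case regardless of which bootstrap sample it was trained on, while the median and spread of the widths summarize repeatability across the dataset.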

Citation

Michael Vieceli, Amy Van Dusen, Karen Drukker, Hiroyuki Abe, Maryellen L. Giger, and Heather M. Whitney "Case-based repeatability of machine learning classification performance on breast MRI", Proc. SPIE 11314, Medical Imaging 2020: Computer-Aided Diagnosis, 1131421 (16 March 2020); https://doi.org/10.1117/12.2548144
