An evaluation of consensus techniques for diagnostic interpretation

Jake N. Sauter; Victoria M. LaBarre; Jacob D. Furst; Daniela S. Raicu

doi:10.1117/12.2293778

27 February 2018

Proceedings Volume 10575, Medical Imaging 2018: Computer-Aided Diagnosis; 1057538 (2018) https://doi.org/10.1117/12.2293778
Event: SPIE Medical Imaging, 2018, Houston, Texas, United States

Abstract

Learning diagnostic labels from image content has been the standard in computer-aided diagnosis. Most computer-aided diagnosis systems use low-level image features extracted directly from image content to train and test machine learning classifiers for diagnostic label prediction. When the ground truth for the diagnostic labels is not available, reference truth is generated from the experts' diagnostic interpretations of the image/region of interest. More specifically, when the label is uncertain, e.g., when multiple experts label an image and their interpretations differ, techniques to handle the label variability are necessary. In this paper, we compare three consensus techniques that are typically used to encode the variability in the experts' labeling of medical data (mean, median, and mode) and their effects on simple classifiers that can handle deterministic labels (decision trees) and probabilistic vectors of labels (belief decision trees). Given that the NIH/NCI Lung Image Database Consortium (LIDC) data provides interpretations for lung nodules by up to four radiologists, we leverage the LIDC data to evaluate and compare these consensus approaches when creating computer-aided diagnosis systems for lung nodules. First, low-level image features of nodules are extracted and paired with their radiologists' semantic ratings (1 = most likely benign, ..., 5 = most likely malignant); second, machine learning multi-class classifiers that handle deterministic labels (decision trees) and probabilistic vectors of labels (belief decision trees) are built to predict the lung nodules' semantic ratings. We show that the mean-based consensus generates the most robust classifier overall when compared to the median- and mode-based consensus. Lastly, the results of this study show that, when building CAD systems with uncertain diagnostic interpretation, it is important to evaluate different strategies for encoding and predicting the diagnostic label.
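
To make the three consensus techniques concrete, the following is a minimal Python sketch of how up to four radiologist ratings on the 1 to 5 scale could be collapsed into a single deterministic label (mean, median, or mode) or kept as a probabilistic label vector. It is an illustration only and not the authors' implementation: the function names, the half-up rounding of the mean and median, the lowest-rating tie-break for the mode, and the use of a normalized rating histogram as the probabilistic vector are all assumptions that the abstract does not specify.

```python
from collections import Counter
from statistics import mean, median

# Semantic rating scale from the abstract: 1 = most likely benign, 5 = most likely malignant.
RATING_SCALE = (1, 2, 3, 4, 5)


def round_half_up(x):
    """Round a positive value to the nearest integer, with .5 going up (assumed rule)."""
    return int(x + 0.5)


def consensus_labels(ratings):
    """Collapse one nodule's radiologist ratings into deterministic labels.

    Hypothetical helper: the rounding and tie-breaking rules are assumptions.
    """
    counts = Counter(ratings)
    top = max(counts.values())
    # Assumption: when several ratings are equally frequent, take the lowest one.
    mode_label = min(r for r, c in counts.items() if c == top)
    return {
        "mean": round_half_up(mean(ratings)),
        "median": round_half_up(median(ratings)),
        "mode": mode_label,
    }


def label_distribution(ratings):
    """Encode the ratings as a probability vector over the five classes.

    Assumption: a normalized histogram stands in for the probabilistic
    label vector consumed by a belief decision tree.
    """
    counts = Counter(ratings)
    return [counts[r] / len(ratings) for r in RATING_SCALE]


if __name__ == "__main__":
    example = [1, 2, 5, 5]  # made-up ratings from four radiologists
    print(consensus_labels(example))
    print(label_distribution(example))
```

For the made-up ratings 1, 2, 5, 5 the three techniques yield three different deterministic labels (3, 4, and 5 under the assumed rules), which illustrates why the choice of consensus technique can change the reference truth a classifier is trained on.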

Citation

Jake N. Sauter, Victoria M. LaBarre, Jacob D. Furst, and Daniela S. Raicu "An evaluation of consensus techniques for diagnostic interpretation", Proc. SPIE 10575, Medical Imaging 2018: Computer-Aided Diagnosis, 1057538 (27 February 2018); https://doi.org/10.1117/12.2293778

PROCEEDINGS
10 PAGES

KEYWORDS

Lung

Computer aided diagnosis and therapy

Diagnostics

Binary data

Feature extraction

Lung cancer

Databases
