Abstract
The goal of this study is to train and assess the performance of a deep 3D convolutional network (3D-CNN) in classifying indeterminate lung nodules as either benign or malignant based solely on diagnostic-grade thoracic CT imaging. While prior studies have relied upon subjective ratings of malignancy by radiologists, our study relies only on data from subjects with biopsy-proven ground truth labels. Our dataset includes 796 patients who underwent CT-guided lung biopsy at one institution between 2012 and 2017. All patients have pathology-confirmed diagnosis (from CT-guided biopsy) and high-resolution CT imaging data acquired immediately prior to biopsy. Lesion location was manually determined using the biopsy guidance CT scan as a reference for a subset of 86 patients for this proof-of-concept study. Rather than training the network without a priori knowledge, which risks over fitting on small datasets, we employed transfer learning, taking the initial layers of our network from an existing neural network trained on a distinct but similar dataset. We then evaluated our network on a held out test set, achieving an area under the receiver operating characteristic curve (AUC) of 0.70 and a classification accuracy of 71%.
You have full access to this open access chapter, Download conference paper PDF
Similar content being viewed by others
Keywords
1 Introduction
1.1 Lung Cancer
Lung cancer is the leading cancer-related cause of death in the US and despite significant advances in treatment options, the five-year survival rate remains low at 18.6% [7]. This is partially explained by the fact that lung cancer has often progressed to an advanced stage by the time it becomes clinically noticeable to most patients. Multiple randomized studies investigating the survival benefit of lung cancer screening have been performed. Screening with plain chest radiographs and sputum cytology has proven questionably useful, but the pivotal National Lung Screening Trial demonstrated a significant survival benefit of screening with low dose spiral computed tomography (CT) in high-risk patients. This study spurred a change in the United States Preventive Services Task Force screening guidelines, which now recommends annual low dose CT scans for patients with a high risk of lung cancer. Under these guidelines, 9 million Americans will receive a screening CT each year, resulting in 2.2 million positive test results [8], which will all require further evaluation. After a positive screening CT, the next step is a diagnostic, full resolution CT scan. If the radiologist finds this CT scan to be concerning, a CT-guided biopsy is performed.
1.2 Lung Biopsies
CT-guided lung biopsies are used to diagnose malignancy with pathologic certainty in patients with suspected lung cancer. Tissue is needed to confirm a diagnosis of lung cancer as well as to guide the application of targeted therapies. Although often necessary, CT-guided lung biopsies carry a risk of possible complications. Complications include pneumothorax, hemoptysis, pain, air embolism, and even death in rare cases. If lung nodules could be accurately classified into malignant vs. benign using exclusively non-invasive imaging data, biopsies could be avoided, sparing patients without malignancies the risk and cost of the biopsy.
1.3 Machine Learning in Lung Cancer Diagnosis
For nearly 30 years, physicians have sought to enhance their ability to accurately classify pulmonary nodules using predictive models [2]. In a study published in 1993, a Bayesian classifier was trained to classify solitary pulmonary nodules as benign or malignant based on hand-extracted clinical and radiographic features [4, 5]. Even with a dataset of limited size, the investigators were able to train a model with an area under the curve (AUC) of 0.71. Various studies have also employed radiomics approaches, using handcrafted imaging features such as texture and entropy to predict for malignancy, with one such study achieving an AUC of 0.79 [6].
Recent advances in the field of computer vision, in particular the popularization of the convolutional neural network (CNN), have resulted in corresponding advances in the field of automated medical image analysis. In the field of lung nodule analysis specifically, multiple groups have presented impressive results. Chon et al. demonstrated that a deep learning approach could accurately detect pulmonary nodules from CT images employing a U-net architecture [1]. The interest pulmonary nodule classification peaked in 2017 when the popular “Kaggle Data Science Bowl" was focused on this topic. A public dataset from the Lung Image Data Consortium (LIDC) was made available and many groups submitted impressive solutions leveraging a variety of network architectures including U-net, AlexNet, and others [10].
All of these studies, however, relied on datasets without true ground truth labels. That is, the label of malignancy vs. non malignancy was based on subjective ratings of by radiologists, not on subjects with biopsy confirmed disease. Because of this, all of these models are limited in performance to what an actual radiologist can achieve today. A natural extension of these methods would be application to a dataset with ground truth labels provided by biopsies.
1.4 Transfer Learning
Because a large open dataset containing both ground truth pathology data and CT imaging data does not yet exist, we turned to transfer learning as a possible solution [9]. In transfer learning, a network is trained on one dataset and then fine-tuned using another dataset. In our case, we chose to train a network using an open dataset from the Lung Image Data Consortium of subjects with and without lung nodules containing CT imaging data along with subjective radiologists ratings of suspicious nodules. We then fine-tuned this network to predict pathologically-confirmed lung cancer, using a smaller dataset of patients with pathologically confirmed lung cancer diagnoses.
2 Methods
2.1 Dataset
Our dataset consists of 796 patients who underwent CT-guided lung biopsy at one institution between 2012 and 2017 to evaluate suspicious pulmonary nodules. All patients had pathology-confirmed diagnosis (from CT-guided biopsy) and high-resolution CT imaging data acquired immediately prior to biopsy. Lesion location was manually determined using the biopsy guidance CT scan as a reference for a subset of 86 patients for this proof of concept study. The median nodule size was 2.1Â cm.
A random selection of 65 patients was used as the training set. The remaining 21 patients were reserved solely for testing and performance assessment.
In the training set, 72% of subjects had biopsies showing malignancy, while 28% of patients were shown to have benign disease. In the testing set, 72% were malignant, while 28% were benign.
Given the 3D location of nodule, the images were re-sampled to 1 mm\(\,\times \,\)1 mm\(\,\times \,\)1 mm resolution and a 64\(\,\times \,\)64\(\,\times \,\)64 cube volume around the center of nodule was then cropped. Visual inspection on the cropped volume was performed to ensure the inclusion of the full nodule. Examples of cropped volumes are shown in Fig. 1 below.
2.2 Network Construction
Rather than training the network without a priori knowledge, which risks over fitting on small datasets, we employed transfer learning, taking the initial layers of our network from an existing neural network trained on a distinct but similar dataset. In our case, we first identified an existing 3D-CNN to identify nodules from thoracic CT data that was trained on an open dataset from the LIDC [3]. Using the final batch pooling layer from the existing network, we then added three new untrained layers (spatial pooling, dense, and softmax) and re-trained the network using our training set employing dropout and batch normalization. This final 3D-CNN network was evaluated using the test set. The network schematic is shown in Fig. 2.
To reduce the model’s parameters, we applied a simple weighted spatial pooling of the pretrained feature vector. Next, a voxel-wise importance map is regressed out with a conv/relu/bn/sigmoid sandwich and is used to weighted average the feature vector spatially, resulting in a single feature vector of 128 features. This layer is followed by a dense layer, a softmax classification layer and standard binary cross entropy loss. Our dataset contains much more malignant cases than benign. To counter class imbalance, class weight of 10:1 (benign to malignant) weight is added to the loss function. We use a standard Adam optimizer with learning rate 1e−3. Two drop out layers with keep rate 0.8/drop rate 0.2 are added to the network to counter overfitting. We use batch size 10 and run the training for 2000 steps. Empirically this is enough for the network to reach convergence.
4 Conclusion
Machine learning based image analysis methods have the potential to significantly enhance radiology workflows, reduce the occurrence of missed diagnoses and false positives, and improve survival rates for lung cancer patients. However, creation of larger, more comprehensive medical image datasets is required before clinically acceptable models can be trained. In this proof of concept study, we demonstrate that a network trained on a publicly available dataset can be fine-tuned, even with a small number of subjects, to a more specific classification task. Although the performance of our model does not reach state-of-the-art in terms of classification accuracy or AUC, we believe that the inclusion of ground truth labels based on pathology is novel and an important step towards clinical adoption of lung cancer CAD software. We are currently extracting additional imaging and pathology data from our larger dataset, and a more complete analysis on the full 796 patients is planned in the near future. In this study, we hope that the larger sample size will allow us to further fine-tune our existing network and allow for meaningful gains in accuracy and AUC. Ultimately, single institution datasets will not lead to optimal classifier performance. Given the importance of diagnosing lung cancer at an early stage and the government’s new screening guidelines, we strongly advise the medical community to begin construction of a comprehensive open dataset consisting of pathology and imaging data. Furthermore, radiologists don’t makes clinical diagnoses using solely on imaging data. They often correlate their imaging finding with additional clinical data such as the patient’s smoking history, demographics, and co-morbid conditions. We hope to expand our dataset to include additional clinical features from the patient’s electronic medical record, with a goal of creating a workflow-integrated CAD solution for lung cancer screening.
References
Chon, A., et al.: Deep convolutional neural networks for lung cancer detection. Technical report, Stanford University (2017)
Deppen, S.A., et al.: Predicting lung cancer prior to surgical resection in patients with lung nodules. J. Thorac. Oncol. 9(10), 1477–1484 (2014)
Foucard, L.: Github Repository (2017). https://github.com/LouisFoucard/DSB17
Gurney, J.W.: Determining the likelihood of malignancy in solitary pulmonary nodules with Bayesian analysis. Part II. Application. Radiology 186(2), 415–22 (1993)
Gurney, J.W., Lyddon, D.M., McKay, J.A.: Determining the likelihood of malignancy in solitary pulmonary nodules with Bayesian Analysis. Part II. Application. Radiology 186(2), 415–422 (1993)
Hawkins, S., et al.: Predicting malignant nodules from screening CT scans. J. Thorac. Oncol. 11(12), 2120–2128 (2016)
Surveillance, Epidemiology, and End Results (SEER) Program (2008–2014). www.seer.cancer.gov
Lokhandwala, T., et al.: Costs of diagnostic assessment for lung cancer: a medicare claims analysis. Clin. Lung Cancer 18(1), e27–34 (2017). https://doi.org/10.1016/j.cllc.2016.07.006
Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? In: Advances in Neural Information Processing Systems, NIPS 2014, vol. 27. NIPS Foundation (2014)
Zhao, X., Liu, L., Qi, S., Teng, Y., Li, J., Qian, W.: Agile convolutional neural network for pulmonary nodule classification using CT images. Int. J. Comput. Assist. Radiol. Surg. 13(4), 585–95 (2018)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Lindsay, W., Wang, J., Sachs, N., Barbosa, E., Gee, J. (2018). Transfer Learning Approach to Predict Biopsy-Confirmed Malignancy of Lung Nodules from Imaging Data: A Pilot Study. In: Stoyanov, D., et al. Image Analysis for Moving Organ, Breast, and Thoracic Images. RAMBO BIA TIA 2018 2018 2018. Lecture Notes in Computer Science(), vol 11040. Springer, Cham. https://doi.org/10.1007/978-3-030-00946-5_29
Download citation
DOI: https://doi.org/10.1007/978-3-030-00946-5_29
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00945-8
Online ISBN: 978-3-030-00946-5
eBook Packages: Computer ScienceComputer Science (R0)