Semi-supervised adversarial model for benign–malignant lung nodule classification on chest CT
Introduction
Lung cancer is the leading cause of cancer death (Torre et al., 2016). The 5-year survival rate for patients with advanced stage IV lung cancer is less than 5%, but it is at least 60% if the diagnosis is made early, when the primary tumor is small and asymptomatic (Wu and Raz, 2016). Early lung cancer detection and effective treatment therefore offer the best chance of cure (Wang et al., 2017). The National Lung Screening Trial showed that screening with computed tomography (CT) results in a 20% reduction in lung cancer deaths through the identification of early disease (Wu and Raz, 2016; Bach et al., 2012). A spot on the lung on a chest CT is defined as a lung nodule, and it can be benign or malignant (Slatore et al., 2016). Most lung cancers arise from small malignant nodules. Radiologists typically read chest CT scans for malignant nodules on a slice-by-slice basis, an approach that is time-consuming, expensive, and prone to operator bias. Computer-aided diagnosis (CAD) systems avoid many of these issues and have been employed to assist radiologists in reading chest CT scans.
Most current CAD systems focus on extracting handcrafted (Han et al., 2015; Dhara et al., 2016; Alilou et al., 2017), learned (Shen et al., 2015), or combined nodule features (Xie et al., 2018b; Xie et al., 2017b; Buty et al., 2016), and then training a feature classifier such as the support vector machine (SVM) (Cortes and Vapnik, 1995), the back-propagation neural network (BPNN) (Rojas, 1996; Hecht-Nielsen, 1992; Zhang et al., 2018), or the random forest (Buty et al., 2016; Breiman, 2001). Recently, deep convolutional neural networks (DCNNs) have achieved great success in many image classification tasks (Litjens et al., 2017), since they offer a unified end-to-end solution for feature extraction and classifier construction and free users from laborious handcrafted feature engineering (Xie et al., 2017a; Shen et al., 2017; Hussein et al., 2017a; Sakamoto et al., 2018; Dey et al., 2018). Although more accurate than handcrafted-feature-based methods, DCNNs have not matched on routine lung nodule classification the performance they achieved in the ImageNet challenge. The suboptimal performance is attributed mainly to the fact that a DCNN may over-fit the lung nodule data, which are far from sufficient to train a deep model.
In fact, an essential challenge in most deep learning-based medical image analysis tasks is the small-data problem, which stems from the effort required to acquire image data and then annotate them. Many research efforts have been devoted to addressing this issue, including data augmentation, deep ensemble learning (Jia et al., 2018), combining traditional shallow models with deep ones (Zhang et al., 2018; Xie et al., 2018b), incorporating domain knowledge into the deep model (Xie et al., 2017a), and extracting patches on multiple planar views (Xie et al., 2018a; Setio et al., 2016). Although these methods achieve improved performance, they still depend on the amount of labeled training data.
Since medical image annotation requires a high degree of skill and concentration and is not always available, semi-supervised learning (SSL) has been adopted to enable the use of both labeled and unlabeled data for model training. The deep learning community has explored a large variety of SSL techniques (e.g., Cheng et al., 2016; Hinton and Salakhutdinov, 2006; Ranzato and Szummer, 2008; Springenberg, 2015; Makhzani et al., 2015; Rasmus et al., 2015; Zhang et al., 2017; Baur et al., 2017; Haeusser et al., 2017). As a typical example, a deep auto-encoder (DAE) trained with unlabeled data can be converted into a classifier by replacing the decoder with fully connected layers and fine-tuning with labeled data. However, the DAE is a generative model: it learns an image representation suited to reconstruction, which may not be suited to discrimination. Sharing the parameters between the encoder of a DAE and the feature-extraction part of a classification network may therefore lead to limited discriminatory power (Rasmus et al., 2015). We therefore suggest jointly using a generative model trained with both labeled and unlabeled data and a discriminative model trained with only labeled data for semi-supervised medical image classification.
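The DAE-to-classifier conversion described above can be sketched in a few lines of numpy. This is a toy illustration only, not the architecture used in this paper: the linear encoder/decoder, the logistic head, and the synthetic data are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples, 8 features; the label depends on the first feature,
# which is given a larger variance so the autoencoder is likely to keep it.
X = rng.normal(size=(200, 8))
X[:, 0] *= 3.0
y = (X[:, 0] > 0).astype(int)

# --- Unsupervised stage: train a linear autoencoder (8 -> 3 -> 8) ---
W_enc = rng.normal(scale=0.1, size=(8, 3))
W_dec = rng.normal(scale=0.1, size=(3, 8))
init_loss = float(np.mean((X @ W_enc @ W_dec - X) ** 2))
lr = 0.01
for _ in range(500):
    H = X @ W_enc                 # encoder: latent codes
    err = H @ W_dec - X           # reconstruction minus input
    W_dec -= lr * (H.T @ err) / len(X)
    W_enc -= lr * (X.T @ (err @ W_dec.T)) / len(X)
recon_loss = float(np.mean((X @ W_enc @ W_dec - X) ** 2))

# --- Supervised stage: discard the decoder, keep the encoder, and train a
# logistic "fully connected" head on the frozen latent codes.
H = X @ W_enc
w, b = np.zeros(3), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(H @ w + b)))
    g = p - y
    w -= 0.1 * (H.T @ g) / len(X)
    b -= 0.1 * float(g.mean())
acc = float(np.mean(((H @ w + b) > 0) == y))
```

The sketch also makes the limitation discussed above concrete: the classifier can only use whatever the reconstruction objective happened to preserve in the latent code.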
In this paper, we propose the semi-supervised adversarial classification (SSAC) model, which can be trained jointly on unlabeled and labeled data without parameter sharing. This model is composed of an adversarial autoencoder-based unsupervised reconstruction network R, a supervised classification network C, and learnable transition (T) layers, which adapt the image representation learned by R to C. The SSAC model is further extended with multi-view knowledge-based collaborative (MV-KBC) learning (Xie et al., 2018a), denoted the MK-SSAC model, for benign–malignant lung nodule classification on chest CT. The proposed MK-SSAC model consists of 27 SSAC submodels: one for each combination of three nodule characterizations (overall appearance (OA), heterogeneity in voxel values (HVV), and heterogeneity in shapes (HS)) and nine planar views (sagittal, coronal, axial, and six diagonal).
DCNN-based nodule classification. The success of DCNNs on several popular image classification benchmarks such as the ImageNet database has prompted many investigators to apply DCNNs to benign–malignant lung nodule classification. Hua et al. (2015) applied a DCNN and deep belief network (DBN) to separate benign lung nodules from malignant ones and reported that deep learning achieved better discrimination than traditional methods. Shen et al. (2017) proposed a multi-crop CNN to extract nodule salient information by cropping different regions from convolutional feature maps and then applying max-pooling multiple times. Hussein et al. (2017a) combined a 3D DCNN with graph regularized sparse multi-task learning to stratify the malignancy of lung nodules.
Small data problem in medical image classification. There have been many attempts to address the issue of small data in medical image classification. First, although it is straightforward to design 3D DCNNs (Hussein et al., 2017a; Dou et al., 2017a; Dou et al., 2017b; Dou et al., 2017c; Li et al., 2018; Yan et al., 2017), extending 2D DCNNs to the analysis of volumetric medical images on a slice-by-slice basis, together with data augmentation (Shen et al., 2017; Setio et al., 2016; Hussein et al., 2017b; Vigneault et al., 2018), yields more training samples. Second, prior domain knowledge, such as the high correspondence between a nodule's malignancy and its heterogeneity (see Fig. 1) (Xie et al., 2017a; Xie et al., 2018a; Metz et al., 2015), can be used to regularize deep models. In our previous work (Xie et al., 2018a), we proposed the MV-KBC learning model to separate malignant nodules from benign ones using limited chest CT data. We decomposed a 3D nodule into nine fixed views and constructed, for each view, a knowledge-based collaborative (KBC) submodel consisting of three pre-trained ResNet-50 networks. We designed three types of image patches to fine-tune the three ResNet-50 networks in each KBC submodel, enabling them to characterize the nodule's OA, HVV, and HS, respectively. We jointly used the nine KBC submodels to classify lung nodules with an adaptive weighting scheme learned during error back-propagation, which allows the MV-KBC model to be trained end-to-end.
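The slice-by-slice, multi-view idea above amounts to extracting 2D patches through the nodule center along several planes. The sketch below shows only the three orthographic views (the six diagonal planes used by MV-KBC would additionally require interpolation); the volume, center, and function name are hypothetical.

```python
import numpy as np

def extract_orthographic_views(volume, center):
    """Extract axial, coronal, and sagittal 2D patches through a nodule center.

    Simplified sketch of multi-view patch extraction; the full MV-KBC model
    also samples six diagonal planes, which would need interpolation.
    """
    z, y, x = center
    return {
        "axial":    volume[z, :, :],
        "coronal":  volume[:, y, :],
        "sagittal": volume[:, :, x],
    }

# Toy CT-like volume with a bright "nodule" at its center.
vol = np.zeros((32, 32, 32), dtype=np.float32)
vol[14:18, 14:18, 14:18] = 1.0

views = extract_orthographic_views(vol, center=(16, 16, 16))
```

Each extracted 2D patch can then be fed to a pre-trained 2D network, which is how multi-view decomposition turns one 3D sample into several 2D training samples.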
Semi-supervised deep learning. SSL methods, which leverage unsupervised learning with unlabeled data to support the learning of a supervised model, enable deep models to be trained on both labeled and unlabeled data, and hence reduce the work of image annotation (Chapelle et al., 2009; Zhu, 2006). There are different choices of unsupervised learning model. Ranzato and Szummer (2008) used an unsupervised autoencoder to reconstruct the input and shared parameters between the encoder and a classification network. Goodfellow et al. (2014) introduced the unsupervised generative adversarial network (GAN), which consists of two adversarial models: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample comes from the training data rather than from G. Treating the GAN as an unsupervised learning model, researchers have developed a series of SSL methods. Makhzani et al. (2015) turned the autoencoder into the adversarial autoencoder (AAE) and exploited the encoder to predict the discrete class label. Springenberg (2015) modified the objective function of the discriminator and thus proposed the categorical GAN (CatGAN), which takes into account the mutual information between observed examples and their predicted class distributions. Rasmus et al. (2015) proposed the ladder network to alleviate the adaptation problem between unsupervised generative and supervised discriminative models. This network has an auxiliary supervised output in the encoder and skip connections from the encoder to the decoder, aiming to reduce the burden on the encoding layers during unsupervised learning. Despite improved accuracy, this model still shares encoder parameters between the generative and discriminative tasks.
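The adversarial game between G and D can be made concrete with a one-dimensional toy example: the "generator" is a single learnable shift applied to noise, the "discriminator" is a logistic classifier, and the two are updated alternately (the non-saturating generator loss is used). Every quantity here is an assumption for illustration; it is not the paper's network.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

# Real data: N(3, 0.5); the generator shifts noise by a learnable theta.
real_mean = 3.0
theta = -2.0              # generator parameter, starts far from the data
a, b = 0.0, 0.0           # discriminator: D(x) = sigmoid(a * x + b)
lr_d, lr_g = 0.05, 0.05

for _ in range(2000):
    real = real_mean + 0.5 * rng.normal(size=64)
    fake = 0.5 * rng.normal(size=64) + theta

    # Discriminator ascent on  log D(real) + log(1 - D(fake))
    d_real = sigmoid(a * real + b)
    d_fake = sigmoid(a * fake + b)
    a += lr_d * float(np.mean((1.0 - d_real) * real) - np.mean(d_fake * fake))
    b += lr_d * float(np.mean(1.0 - d_real) - np.mean(d_fake))

    # Generator ascent on  log D(fake)  (non-saturating GAN loss)
    d_fake = sigmoid(a * fake + b)
    theta += lr_g * float(np.mean((1.0 - d_fake) * a))
```

After training, the generator's shift sits near the real mean: the adversarial signal alone, with no reconstruction target, pulls the generated distribution toward the data distribution, which is the property SSAC's reconstruction network exploits.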
The main contributions of this work include: (a) the proposed SSAC model uses learnable T layers to transfer the representation ability learned by the reconstruction network R to the classification network C, abandoning the parameter-sharing and feature-adaptation strategy; (b) adversarial training is used in R to minimize the discrepancy between the distributions of input lung nodules and reconstructed ones; and (c) the extended MK-SSAC model has been evaluated on the LIDC-IDRI dataset and achieves state-of-the-art performance in benign–malignant lung nodule classification.
Section snippets
Experimental dataset
The LIDC-IDRI dataset (Armato et al., 2011)
Methodology
The proposed SSAC model consists of three major modules: an adversarial autoencoder-based unsupervised reconstruction network R, a supervised classification network C, and learnable T layers (see Fig. 2). This model has been extended to the MK-SSAC model (see Fig. 4(b)) for benign–malignant lung nodule classification, which contains 27 SSAC submodels: one for each combination of the three nodule characterizations (OA, HVV, and HS) and the nine views (three orthographic and six diagonal).
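The role of a T layer can be pictured as a learnable map that carries features from R into C, rather than forcing the two networks to share weights. The fragment below is a minimal structural sketch under assumed shapes (batch of 4, 16-d features, a single linear-plus-ReLU transition); the real T layers operate on convolutional feature maps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical mid-network feature batches:
# f_R from the reconstruction network R (trained on labeled + unlabeled data),
# f_C from the classification network C (trained on labeled data only).
f_R = rng.normal(size=(4, 16))
f_C = rng.normal(size=(4, 16))

# Learnable transition (T) layer: a linear map with ReLU that adapts R's
# representation and merges it additively into C's feature stream.
W_T = rng.normal(scale=0.1, size=(16, 16))
f_C_adapted = f_C + np.maximum(f_R @ W_T, 0.0)
```

Because W_T is trained with C's supervised loss, the classification network decides how much of R's representation to absorb, instead of inheriting it wholesale through shared parameters.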
Experimental design
We applied the proposed MK-SSAC model to the LIDC-IDRI dataset five times independently, using 10-fold cross-validation. The Tianchi dataset, which includes 1839 unlabeled nodules, was used to train each MK-SSAC model. It took about 24 h to train the MK-SSAC model and less than 0.5 s to classify each nodule (Intel Xeon E5-2640 V4 CPU, 4 NVIDIA Titan X GPUs, 512 GB RAM, Keras and TensorFlow). The training process is time-consuming, but can be performed offline. The very fast testing
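The evaluation protocol above, repeated 10-fold cross-validation, is index bookkeeping at heart. The generator below is a hypothetical illustration of one repetition (each of the five runs would use a different shuffle seed); it is not the authors' code.

```python
import numpy as np

def kfold_indices(n, k=10, seed=0):
    """Yield (train, test) index arrays for k-fold cross-validation."""
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test

# One of the five repetitions: a single shuffled 10-fold split.
splits = list(kfold_indices(100, k=10, seed=0))
```

Each of the k folds serves as the test set exactly once, so every labeled nodule is scored by a model that never saw it during training.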
Comparison to other SSL methods
The performance of the proposed MK-SSAC model and three SSL models is compared in Table 2, which shows that the MK-SSAC model achieved the highest accuracy, sensitivity, and AUC, and the second highest specificity. In particular, the improvement in sensitivity (at least 1.87%) is substantial. Since a higher sensitivity indicates a lower false-negative rate, our model is more suitable for lung nodule screening and potentially more useful in clinical practice than the other three SSL models.
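The metrics compared above follow the standard confusion-matrix definitions for a binary benign/malignant classifier. A small helper (hypothetical names; thresholding at 0.5 is an assumption) makes the sensitivity/specificity trade-off explicit:

```python
import numpy as np

def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, and specificity for a binary nodule classifier.

    y_true: 0 (benign) / 1 (malignant); y_score: predicted malignancy score.
    """
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(y_score) >= threshold).astype(int)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate: malignant found
        "specificity": tn / (tn + fp),   # true negative rate: benign spared
    }

m = classification_metrics([1, 1, 1, 0, 0, 0], [0.9, 0.8, 0.3, 0.2, 0.1, 0.6])
```

A false negative here is a missed malignant nodule, which is why sensitivity is the metric emphasized for screening.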
Ablation studies
The contribution of the T layers, which bridge the reconstruction and classification networks, was illustrated in the first experiment. Besides the T layers, the reconstruction network itself and the adversarial learning used in it play a pivotal role in the proposed MK-SSAC model. To demonstrate the contributions of these two modules, we conducted ablation studies by constructing the MK-C and MK-SSGC models. In MK-C, each of the 27 submodels contains only the classification network
Conclusion
We presented the MK-SSAC model to differentiate malignant lung nodules from benign ones on chest CT, proposing a novel semi-supervised strategy to effectively use unlabeled nodules and taking into account the nodule's heterogeneity in shape and voxel values on nine planar views. Experimental results on the LIDC-IDRI dataset demonstrate that our MK-SSAC model improves on state-of-the-art nodule classification systems. Although our model is built upon the specific
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Grant 61771397, in part by the Science and Technology Innovation Committee of Shenzhen Municipality, China, under Grant JCYJ20180306171334997, in part by the Synergy Innovation Foundation of the University and Enterprise for Graduate Students in Northwestern Polytechnical University under Grant XQ201911, and in part by the Project for Graduate Innovation Team of Northwestern Polytechnical University. We
References (64)
- Generative adversarial nets.
- Kingma, D.P., Ba, J., 2014. Adam: a method for stochastic optimization.
- A survey on deep learning in medical image analysis. Med. Image Anal. (2017).
- Visualizing data using t-SNE. J. Mach. Learn. Res. (2008).
- Springenberg, J.T., 2015. Unsupervised and semi-supervised learning with categorical generative adversarial networks.
- Lung Cancer Screening (2016).
- Transferable multi-model ensemble for benign–malignant lung nodule classification on chest CT. Proc. MICCAI (2017).
- Classification of medical images in the biomedical literature by jointly using deep and handcrafted visual features. IEEE J. Biomed. Health Inform. (2018).
- Intra-perinodular textural transition (IPRIS): a 3D descriptor for nodule diagnosis on lung CT. Proc. MICCAI (2017).
- The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT scans. Med. Phys. (2011).
- LUNGx challenge for computerized lung nodule classification: reflections and lessons learned. J. Med. Imaging.
- Benefits and harms of CT screening for lung cancer: a systematic review. JAMA.
- Semi-supervised deep learning for fully convolutional networks. Proc. MICCAI.
- Random forests. Mach. Learn.
- Characterization of lung nodule malignancy using hybrid shape and appearance features. Proc. MICCAI.
- Semi-supervised learning. IEEE Trans. Neural Netw.
- Semi-supervised multimodal deep learning for RGB-D object recognition. Proc. IJCAI.
- The cancer imaging archive (TCIA): maintaining and operating a public information repository. J. Digit. Imaging.
- Skin lesion analysis toward melanoma detection: a challenge at the 2017 international symposium on biomedical imaging (ISBI), hosted by the international skin imaging collaboration (ISIC). Proc. ISBI 2018.
- Support-vector networks. Mach. Learn.
- ImageNet: a large-scale hierarchical image database. Proc. CVPR.
- Diagnostic classification of lung nodules using 3D neural networks. Proc. ISBI 2018.
- A combination of shape and texture features for classification of pulmonary nodules in lung CT images. J. Digit. Imaging.
- Automated pulmonary nodule detection via 3D ConvNets with online sample filtering and hybrid-loss residual learning. Proc. MICCAI.
- Multilevel contextual 3-D CNNs for false positive reduction in pulmonary nodule detection. IEEE Trans. Biomed. Eng.
- 3D deeply supervised network for automated segmentation of volumetric medical images. Med. Image Anal.
- Learning by association: a versatile semi-supervised training method for neural networks. Proc. CVPR.
- Texture feature analysis for computer-aided diagnosis on pulmonary nodules. J. Digit. Imaging.
- Deep residual learning for image recognition. Proc. CVPR.