
Medical Image Analysis

Volume 57, October 2019, Pages 237-248

Semi-supervised adversarial model for benign–malignant lung nodule classification on chest CT

https://doi.org/10.1016/j.media.2019.07.004

Highlights

  • Propose the semi-supervised adversarial classification (SSAC) model without using parameter sharing.

  • Use learnable layers to transfer image representation from the reconstruction network to the classification network.

  • Use adversarial learning in the reconstruction network for performance gain.

  • Extend SSAC to MK-SSAC for lung nodule classification, achieving state-of-the-art performance.

Abstract

Classification of benign–malignant lung nodules on chest CT is the most critical step in the early detection of lung cancer and prolongation of patient survival. Despite their success in image classification, deep convolutional neural networks (DCNNs) always require a large amount of labeled training data, which is not available for most medical image analysis applications due to the work required in image acquisition and, particularly, image annotation. In this paper, we propose a semi-supervised adversarial classification (SSAC) model that can be trained using both labeled and unlabeled data for benign–malignant lung nodule classification. This model consists of an adversarial autoencoder-based unsupervised reconstruction network R, a supervised classification network C, and learnable transition layers that enable the adaption of the image representation ability learned by R to C. The SSAC model has been extended to multi-view knowledge-based collaborative learning, which employs three SSACs to characterize each nodule's overall appearance and its heterogeneity in shape and texture, respectively, and performs this characterization on nine planar views. The resulting MK-SSAC model has been evaluated on the benchmark LIDC-IDRI dataset and achieves an accuracy of 92.53% and an AUC of 95.81%, which are superior to the performance of other lung nodule classification and semi-supervised learning approaches.

Introduction

Lung cancer is the leading cause of cancer death (Torre et al., 2016). The 5-year survival rate for patients with advanced stage IV lung cancer is less than 5%, but it is at least 60% if the diagnosis is made early, when the primary tumor is small and asymptomatic (Wu and Raz, 2016). Early lung cancer detection and effective treatment therefore offer the best chance for cure (Wang et al., 2017). The National Lung Screening Trial showed that screening with computed tomography (CT) results in a 20% reduction in lung cancer deaths through the identification of early disease (Wu, Raz, 2016, Bach, Mirkin, Oliver, 2012). A spot on the lung on a chest CT scan is defined as a lung nodule, and it can be benign or malignant (Slatore et al., 2016). Most lung cancers arise from small malignant nodules. Radiologists typically read chest CT scans for malignant nodules on a slice-by-slice basis, an approach that is time-consuming, expensive, and can be prone to operator bias. Computer-aided diagnosis (CAD) systems avoid many of these issues and have been employed to assist radiologists in reading chest CT scans.

Most current CAD systems focus on extracting handcrafted (Han, Wang, Zhang, Han, Song, Li, Moore, Lu, Zhao, Liang, 2015, Dhara, Mukhopadhyay, Dutta, Garg, Khandelwal, 2016, Alilou, Orooji, Madabhushi, 2017), learned (Shen et al., 2015), or combined nodule features (Xie, Zhang, Xia, Fulham, Zhang, 2018b, Xie, Zhang, Liu, Cai, Xia, 2017b, Buty, Xu, Gao, Bagci, Wu, Mollura, 2016), and then training a feature classifier such as the support vector machine (SVM) (Cortes and Vapnik, 1995), the back-propagation neural network (BPNN) (Rojas, 1996, Hecht-Nielsen, 1992, Zhang, Xia, Xie, Fulham, Feng, 2018), or the random forest (Buty, Xu, Gao, Bagci, Wu, Mollura, 2016, Breiman, 2001). Recently, deep convolutional neural networks (DCNNs) have achieved great success in many image classification tasks (Litjens et al., 2017), since they offer a unified end-to-end solution for feature extraction and classifier construction and free users from troublesome handcrafted feature engineering (Xie, Xia, Zhang, Feng, Fulham, Cai, 2017a, Shen, Zhou, Yang, Yu, Dong, Yang, Zang, Tian, 2017, Hussein, Cao, Song, Bagci, 2017a, Sakamoto, Nakano, Zhao, Sekiyama, 2018, Dey, Lu, Hong, 2018). Although more accurate than handcrafted-feature-based methods, DCNNs have not achieved the same performance on routine lung nodule classification as they have in the ImageNet challenge. This suboptimal performance is attributed mainly to the fact that a DCNN may over-fit the lung nodule data, which are far from adequate for training a deep model.

In fact, an essential challenge in most deep learning-based medical image analysis tasks is the small-data problem, which stems from the work required first to acquire the image data and then to annotate it. Many research efforts have been devoted to addressing this issue, including data augmentation, deep ensemble learning (Jia et al., 2018), combining traditional shallow models with deep ones (Zhang, Xia, Xie, Fulham, Feng, 2018, Xie, Zhang, Xia, Fulham, Zhang, 2018b), incorporating domain knowledge into the deep model (Xie et al., 2017a), and extracting patches on multiple planar views (Xie, Xia, Zhang, Song, Feng, Fulham, Cai, 2018a, Setio, Ciompi, Litjens, Gerke, Jacobs, van Riel, Wille, Naqibullah, Snchez, van Ginneken, 2016). Although these methods achieve improved performance, they still depend heavily on the amount of labeled training data.

Since medical image annotation requires a high degree of skill and concentration, labeled data are not always available, and semi-supervised learning (SSL) has therefore been adopted to enable the use of both labeled and unlabeled data for model training. The deep learning community has explored a large variety of SSL techniques (Cheng, Zhao, Cai, Li, Huang, Rui, 2016, Hinton, Salakhutdinov, 2006, Ranzato, Szummer, 2008, Springenberg, Radford, Metz, Chintala, Makhzani, Shlens, Jaitly, Goodfellow, Frey, Rasmus, Berglund, Honkala, Valpola, Raiko, 2015, Zhang, Yang, Chen, Fredericksen, Hughes, Chen, 2017, Baur, Albarqouni, Navab, 2017, Haeusser, Mordvintsev, Cremers, 2017). As a typical example, a deep autoencoder (DAE) trained with unlabeled data can be converted into a classifier by replacing the decoder with fully connected layers and fine-tuning with labeled data. However, the DAE is a generative model: it learns an image representation that is suitable for reconstruction but may not be suitable for discrimination. Sharing parameters between the encoder of a DAE and the feature-extraction part of a classification network may therefore lead to limited discriminatory power (Rasmus et al., 2015). We thus suggest jointly using a generative model trained with both labeled and unlabeled data and a discriminative model trained with only labeled data for semi-supervised medical image classification.

In this paper, we propose the semi-supervised adversarial classification (SSAC) model, which can be trained by jointly using unlabeled and labeled data in a non-parameter-sharing manner. The model is composed of an adversarial autoencoder-based unsupervised reconstruction network R, a supervised classification network C, and learnable transition layers (T layers) that enable the adaption of the image representation ability learned by R to C. The SSAC model has been further extended to multi-view knowledge-based collaborative (MV-KBC) learning (Xie et al., 2018a), denoted as the MK-SSAC model, for benign–malignant lung nodule classification on chest CT. The proposed MK-SSAC model consists of 27 SSAC submodels, each characterizing a nodule’s overall appearance (OA), heterogeneity in voxel values (HVV), or heterogeneity in shapes (HS) from one of the sagittal, coronal, axial, and six diagonal planar views.
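
The non-parameter-sharing design can be made concrete with a toy sketch. The snippet below is a minimal numpy illustration, not the authors' Keras implementation: the dimensions, weights, and function names are assumptions, and the deep CNN encoders of R and C are replaced by single linear layers.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Toy stand-ins for the reconstruction network R (encoder part) and the
# classification network C (feature-extraction part). All weights and
# dimensions are illustrative; the paper's networks are deep CNNs.
d_in, d_feat = 64, 32
W_r = rng.normal(scale=0.1, size=(d_in, d_feat))    # R's encoder (sees unlabeled data too)
W_c = rng.normal(scale=0.1, size=(d_in, d_feat))    # C's feature extractor (labeled data only)
W_t = rng.normal(scale=0.1, size=(d_feat, d_feat))  # learnable T layer

def ssac_features(x):
    """Fuse R's representation into C through the transition layer,
    instead of sharing parameters between the two networks."""
    f_r = relu(x @ W_r)           # representation learned by R
    f_c = relu(x @ W_c)           # representation learned by C
    return f_c + relu(f_r @ W_t)  # T layer adapts f_r before fusion

x = rng.normal(size=(4, d_in))    # a mini-batch of 4 flattened nodule patches
f = ssac_features(x)
print(f.shape)  # (4, 32)
```

Because both branches keep their own parameters, gradients from the classification loss never overwrite what R learned from unlabeled nodules; only the T layer mediates between the two.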

DCNN-based nodule classification. The success of DCNNs on several popular image classification benchmarks such as the ImageNet database has prompted many investigators to apply DCNNs to benign–malignant lung nodule classification. Hua et al. (2015) applied a DCNN and deep belief network (DBN) to separate benign lung nodules from malignant ones and reported that deep learning achieved better discrimination than traditional methods. Shen et al. (2017) proposed a multi-crop CNN to extract nodule salient information by cropping different regions from convolutional feature maps and then applying max-pooling multiple times. Hussein et al. (2017a) combined a 3D DCNN with graph regularized sparse multi-task learning to stratify the malignancy of lung nodules.

Small data problem in medical image classification. There have been many attempts to address the issue of small data in medical image classification. First, although it is straightforward to design 3D DCNNs (Hussein, Cao, Song, Bagci, 2017a, Dou, Chen, Yu, Qin, Heng, 2017b, Dou, Yu, Chen, Jin, Yang, Qin, Heng, 2017c, Li, Dou, Chen, Fu, Qi, Belav, Armbrecht, Felsenberg, Zheng, Heng, 2018, Dou, Chen, Jin, Lin, Qin, Heng, 2017a, Yan, Pang, Qi, Zhu, Bai, Geng, Liu, Terzopoulos, Ding, 2017), extending the use of 2D DCNNs to the analysis of volumetric medical images on a slice-by-slice basis, together with data augmentation (Shen, Zhou, Yang, Yu, Dong, Yang, Zang, Tian, 2017, Setio, Ciompi, Litjens, Gerke, Jacobs, van Riel, Wille, Naqibullah, Snchez, van Ginneken, 2016, Hussein, Gillies, Cao, Song, Bagci, 2017b, Vigneault, Xie, Ho, Bluemke, Noble, 2018), provides more training samples. Second, prior domain knowledge, such as the high correspondence between a nodule’s malignancy and its heterogeneity (see Fig. 1) (Xie, Xia, Zhang, Feng, Fulham, Cai, 2017a, Xie, Xia, Zhang, Song, Feng, Fulham, Cai, 2018a, Metz, Ganter, Lorenzen, van Marwick, Holzapfel, Herrmann, Rummeny, Wester, Schwaiger, Nekolla, Beer, 2015), can be used to regularize deep models. In our previous work (Xie et al., 2018a), we proposed the MV-KBC learning model to separate malignant nodules from benign ones using limited chest CT data. We decomposed a 3D nodule into nine fixed views and constructed, for each view, a knowledge-based collaborative (KBC) submodel consisting of three pre-trained ResNet-50 networks. We designed three types of image patches to fine-tune the three ResNet-50 networks in each KBC submodel, enabling them to characterize the nodule’s OA, HVV, and HS, respectively. We jointly used the nine KBC submodels to classify lung nodules with an adaptive weighting scheme learned during error back-propagation, which enables us to train the MV-KBC model in an end-to-end manner.

Semi-supervised deep learning. SSL methods, which leverage unsupervised learning with unlabeled data to support the learning of a supervised model, enable us to train deep models using both labeled and unlabeled data, and hence reduce the work related to image annotation (Chapelle, Scholkopf, Zien, 2009, Zhu, 2006). There are different choices for the unsupervised learning model. Ranzato and Szummer (2008) used an unsupervised autoencoder to reconstruct the input and shared parameters between the encoder and a classification network. Goodfellow et al. (2014) introduced the unsupervised generative adversarial network (GAN), which consists of two adversarial models: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample comes from the training data rather than G. Using the GAN as an unsupervised learning model, researchers have developed a series of SSL methods. Makhzani et al. (2015) turned an autoencoder into the adversarial autoencoder (AAE) and exploited the encoder to predict the discrete class label. Springenberg (2015) modified the objective function of the discriminator to propose the categorical GAN (CatGAN), which takes into account the mutual information between observed examples and their predicted class distribution. Rasmus et al. (2015) proposed the ladder network to alleviate the adaptation problem between unsupervised generative and supervised discriminative models. This network has an auxiliary supervised output in the encoder part and skip connections from the encoder to the decoder, aiming to reduce the burden on the encoding layers during unsupervised learning. Despite improved accuracy, this model still shares parameters in the encoder part between the generative and discriminative tasks.
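
The GAN objective underlying these methods can be made concrete for the reconstruction setting. The following is a minimal numpy sketch with made-up discriminator scores; the values and helper name are illustrative, not taken from any of the cited models:

```python
import numpy as np

def bce(p, y):
    """Binary cross-entropy for discriminator outputs p in (0, 1)."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

# Illustrative discriminator scores: D should output ~1 for real nodule
# patches and ~0 for reconstructions produced by the generator R.
d_real = np.array([0.9, 0.8, 0.95])   # D(x) on real patches
d_fake = np.array([0.1, 0.2, 0.05])   # D(R(x)) on reconstructions

# Discriminator loss: push D(x) -> 1 and D(R(x)) -> 0.
loss_d = bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

# Generator (reconstruction) loss: R tries to make D(R(x)) -> 1, i.e. to
# make reconstructions indistinguishable from real inputs.
loss_g = bce(d_fake, np.ones_like(d_fake))

print(loss_d < loss_g)  # True: this D separates the two sets well, so R's loss is high
```

Alternating minimization of these two losses is what drives the reconstructed distribution toward the input distribution.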

The main contributions of this work are: (a) the proposed SSAC model uses learnable T layers to transfer the representation ability learned by the reconstruction network R to the classification network C, abandoning the parameter-sharing and feature-adaption strategy; (b) adversarial training is used in R to minimize the discrepancy between the distributions of the input lung nodules and the reconstructed ones; and (c) the extended MK-SSAC model has been evaluated on the LIDC-IDRI dataset and achieves state-of-the-art performance in benign–malignant lung nodule classification.

Section snippets

Experimental dataset

The LIDC-IDRI dataset (Armato, McLennan, Bidaut, McNitt-Gray, Meyer, Reeves, Zhao, Aberle, Henschke, Hoffman, Kazerooni, MacMahon, Beek, Yankelevitz, Biancardi, Bland, Brown, Engelmann, Laderach, Max, Pais, Qing, Roberts, Smith, Starkey, Batra, Caligiuri, Farooqi, Gladish, Jude, Munden, Petkovska, Quint, Schwartz, Sundaram, Dodd, Fenimore, Gur, Petrick, Freymann, Kirby, Hughes, Vande Casteele, Gupte, Sallam, Heath, Kuhn, Dharaiya, Burns, Fryd, Salganicoff, Anand, Shreter, Vastagh, Croft,

Methodology

The proposed SSAC model consists of three major modules: an adversarial autoencoder-based unsupervised reconstruction network R, a supervised classification network C, and learnable T layers (see Fig. 2). This model has been extended to the MK-SSAC model (see Fig. 4(b)) for benign–malignant lung nodule classification, which contains 27 SSAC submodels, each characterizing a nodule’s OA, HVV, or HS from one of the three orthogonal and six diagonal views.
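
The fusion of the 27 submodel predictions can be sketched as a learnable weighted average. The snippet below is a simplified numpy illustration with placeholder probabilities; in the paper the fusion weights are learned jointly with the submodels via back-propagation, whereas here they are simply initialized uniformly:

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Malignancy probabilities from the 27 submodels (3 patch types x 9 views)
# for one nodule; the values here are random placeholders.
p_sub = rng.uniform(size=27)

# Learnable fusion logits; zeros give uniform weights before training.
w = softmax(np.zeros(27))

p_fused = float(w @ p_sub)   # adaptive weighted average over submodels
print(0.0 <= p_fused <= 1.0)  # True: a convex combination of probabilities
```

Because the weights form a convex combination, the fused score remains a valid probability regardless of how the logits evolve during training.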

Experimental design

We applied the proposed MK-SSAC model to the LIDC-IDRI dataset five times independently, with 10-fold cross-validation. The Tianchi dataset, which includes 1839 unlabeled nodules, was used to train each MK-SSAC model. It took about 24 h to train the MK-SSAC model and less than 0.5 s to classify each nodule (Intel Xeon E5-2640 V4 CPU, 4 NVIDIA Titan X GPUs, 512 GB RAM, Keras and TensorFlow). The training process is time-consuming, but can be performed offline. The very fast testing
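
The evaluation protocol (five independent repetitions of 10-fold cross-validation) can be sketched as follows; the fold-splitting helper is an illustrative stdlib implementation, not the authors' code:

```python
import random

def k_fold_indices(n, k, seed):
    """Shuffle sample indices and split them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

n_nodules, k, n_repeats = 100, 10, 5   # toy dataset size
for run in range(n_repeats):           # 5 independent repetitions
    folds = k_fold_indices(n_nodules, k, seed=run)
    for held_out in range(k):          # each fold serves once as the test set
        test = set(folds[held_out])
        train = set(range(n_nodules)) - test
        assert not test & train        # train and test never overlap
```

Repeating the whole k-fold split with different seeds gives variance estimates across runs, not just across folds.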

Comparison to other SSL methods

The performance of the proposed MK-SSAC model and three SSL models is compared in Table 2, which shows that the MK-SSAC model achieved the highest accuracy, sensitivity, and AUC and the second-highest specificity. In particular, the improvement in sensitivity (at least 1.87%) is substantial. Since a higher sensitivity indicates a lower false-negative rate, our model is more suitable for lung nodule screening and potentially more useful in clinical practice than the other three SSL models.
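
For reference, accuracy, sensitivity, and specificity are computed from the confusion matrix as below; the counts are made-up examples, not the paper's results:

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity (recall on malignant), and specificity."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    sen = tp / (tp + fn)   # fraction of malignant nodules detected
    spe = tn / (tn + fp)   # fraction of benign nodules correctly cleared
    return acc, sen, spe

# Illustrative confusion-matrix counts for a hypothetical test fold.
acc, sen, spe = classification_metrics(tp=90, fp=10, tn=85, fn=5)
print(round(acc, 4), round(sen, 4), round(spe, 4))  # 0.9211 0.9474 0.8947
```

A lower false-negative count (fn) raises sensitivity directly, which is why sensitivity is the metric emphasized for screening.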

Ablation studies

The contribution of the T layers that bridge the reconstruction and classification networks was illustrated in the first experiment. Besides the T layers, the reconstruction network itself and the adversarial learning used in it play a pivotal role in the proposed MK-SSAC model. To demonstrate the contributions of these two modules, we conducted ablation studies by constructing the MK-C and MK-SSGC models. In MK-C, each of the 27 submodels contains only the classification network

Conclusion

We presented the MK-SSAC model, which differentiates malignant lung nodules from benign ones on chest CT by using a novel semi-supervised strategy to exploit unlabeled nodules effectively and by taking into account each nodule’s heterogeneity in shape and voxel values on nine planar views. Experimental results on the LIDC-IDRI dataset demonstrate the effectiveness of our MK-SSAC model in improving upon state-of-the-art nodule classification systems. Although our model is built upon the specific

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 61771397, in part by the Science and Technology Innovation Committee of Shenzhen Municipality, China, under Grant JCYJ20180306171334997, in part by the Synergy Innovation Foundation of the University and Enterprise for Graduate Students in Northwestern Polytechnical University under Grant XQ201911, and in part by the Project for Graduate Innovation Team of Northwestern Polytechnical University. We

References (64)

  • S.G. Armato III et al., LUNGx challenge for computerized lung nodule classification: reflections and lessons learned, J. Med. Imaging (2015)
  • Armato III, S. G., McLennan, G., Bidaut, L., McNitt-Gray, M. F., Meyer, C. R., Reeves, A. P., Clarke, L. P., 2015b....
  • P.B. Bach et al., Benefits and harms of CT screening for lung cancer: a systematic review, JAMA (2012)
  • C. Baur et al., Semi-supervised deep learning for fully convolutional networks, Proceedings of International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) (2017)
  • Bi, L., Kim, J., Ahn, E., Feng, D., 2017. Automatic skin lesion analysis using large-scale dermoscopy images and deep...
  • L. Breiman, Random forests, Mach. Learn. (2001)
  • M. Buty et al., Characterization of lung nodule malignancy using hybrid shape and appearance features, Proceedings of International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) (2016)
  • O. Chapelle et al., Semi-supervised learning, IEEE Trans. Neural Netw. (2009)
  • Y. Cheng et al., Semi-supervised multimodal deep learning for RGB-D object recognition, Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI) (2016)
  • K. Clark et al., The Cancer Imaging Archive (TCIA): maintaining and operating a public information repository, J. Digit. Imaging (2013)
  • N.C. Codella et al., Skin lesion analysis toward melanoma detection: a challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), hosted by the International Skin Imaging Collaboration (ISIC), IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018) (2018)
  • C. Cortes et al., Support-vector networks, Mach. Learn. (1995)
  • J. Deng et al., ImageNet: a large-scale hierarchical image database, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2009)
  • R. Dey et al., Diagnostic classification of lung nodules using 3D neural networks, IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018) (2018)
  • A.K. Dhara et al., A combination of shape and texture features for classification of pulmonary nodules in lung CT images, J. Digit. Imaging (2016)
  • Díaz, I. G., 2017. Incorporating the knowledge of dermatologists to convolutional neural networks for the diagnosis of...
  • Q. Dou et al., Automated pulmonary nodule detection via 3D ConvNets with online sample filtering and hybrid-loss residual learning, Proceedings of International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) (2017)
  • Q. Dou et al., Multilevel contextual 3-D CNNs for false positive reduction in pulmonary nodule detection, IEEE Trans. Biomed. Eng. (2017)
  • Q. Dou et al., 3D deeply supervised network for automated segmentation of volumetric medical images, Med. Image Anal. (2017)
  • P. Haeusser et al., Learning by association: a versatile semi-supervised training method for neural networks, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
  • F. Han et al., Texture feature analysis for computer-aided diagnosis on pulmonary nodules, J. Digit. Imaging (2015)
  • K. He et al., Deep residual learning for image recognition, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)