Elsevier

Knowledge-Based Systems

Volume 189, 15 February 2020, 105128

DeepLN: A framework for automatic lung nodule detection using multi-resolution CT screening images

https://doi.org/10.1016/j.knosys.2019.105128

Abstract

Computed tomography (CT) is an important and valuable tool for detecting and diagnosing lung cancer at an early stage. Commonly, CT screenings with lower dose and resolution are used for preliminary screening. In particular, many hospitals in smaller towns only provide CT screenings at low resolution. However, when patients are diagnosed with suspected cancer, they are transferred or referred to larger hospitals for more sophisticated examinations with high-resolution CT scans. Therefore, multi-resolution CT images deserve attention and are critical in clinical practice. Currently, the available open source datasets only contain high-resolution CT screening images. To address this problem, a multi-resolution CT screening image dataset called the DeepLNDataset is constructed. A three-level labeling criterion and a semi-automatic annotation system are presented to guarantee the correctness and efficiency of lung nodule annotation. Moreover, a novel framework called DeepLN is proposed to detect lung nodules in both low-resolution and high-resolution CT screening images. Multi-level features are extracted by a neural-network based detector to locate the lung nodules. Hard negative mining and a modified focal loss function are employed to address the common category imbalance problem. A novel non-maximum suppression based ensemble strategy is proposed to synthesize the results from multiple neural network models trained on CT image datasets of different resolutions. To the best of our knowledge, this is the first work that considers the influence of multiple resolutions on lung nodule detection. The experimental results demonstrate that the proposed method can address this issue well.
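
The two imbalance remedies named above can be illustrated with a minimal sketch. It shows the standard binary focal loss, not the modified version used in DeepLN (whose form is not given here), together with a simple top-k selection of the highest-scoring negatives. The function names and the values of `alpha`, `gamma`, and `k` are assumptions chosen for illustration.

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Standard binary focal loss for one prediction (assumed form, not the
    paper's modified loss). p is the predicted nodule probability, y is 0 or 1."""
    p = min(max(p, eps), 1.0 - eps)           # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p            # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Well-classified samples (p_t near 1) are down-weighted by (1 - p_t) ** gamma.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def hardest_negatives(scores, labels, k):
    """Indices of the k negative samples with the highest predicted scores."""
    negatives = [i for i, y in enumerate(labels) if y == 0]
    negatives.sort(key=lambda i: scores[i], reverse=True)
    return negatives[:k]

# Hypothetical candidate scores: one true nodule and four non-nodule candidates.
scores = [0.9, 0.8, 0.1, 0.6, 0.05]
labels = [1, 0, 0, 0, 0]
kept = hardest_negatives(scores, labels, k=2)
losses = [focal_loss(scores[i], labels[i]) for i in kept]
```

In this sketch only the confidently misclassified negatives contribute to training, and easy negatives that do get through receive a small loss because of the `(1 - p_t) ** gamma` factor.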

Introduction

Lung cancer is the deadliest cancer in the world [1]. A computed tomography (CT) screening is a quick and painless procedure that produces clear images of the inside of the lungs. CT screening is widely used to help diagnose and monitor treatment for a variety of pulmonary diseases such as lung cancer. These diseases manifest in images as lung nodules. To find the lung nodules in CT screening images for further diagnosis, experienced radiologists must carefully read the CT screening images slice by slice. This process requires substantial time and effort. Furthermore, there are not enough experienced experts to provide high-quality medical services for many patients. Therefore, automatic detection of lung nodules is an important research topic in computer-assisted diagnosis.

Because of the high potential value of automatic lung nodule detection, considerable effort has been devoted to this research in recent years. However, lower-dose CT scans with lower resolution are used for preliminary screening, especially in the many hospitals in smaller towns that may only provide low-dose CT scans. In contrast, the available open source datasets only contain high-resolution CT screening images. To address this problem, we constructed a multi-resolution CT screening image dataset called the DeepLNDataset.

The construction of such a dataset requires a large number of radiologist annotations. The annotation process, as in clinical practice, is often based on the radiologists’ experience; however, radiologists have differing opinions about lung nodule annotation. Employing an effective annotation method is the key to guaranteeing the objectivity and accuracy of labeling [2], [3], [4], [5], [6], [7], [8]. For example, in [8], each case was blindly marked by four radiologists. The LUNA2016 challenge used only the parts common to all radiologists’ annotations; this simple approach causes some real nodules to be omitted. In [7], the first round of annotation was accomplished by a CAD system. Then, in the second round, the initial labels were reviewed by two medical students to identify nodules. Since they lacked clinical experience, the accuracy of those results cannot be guaranteed. The CAD system used in [7] was threshold-based and therefore could not both guarantee sensitivity and reduce false positives (FPs). In contrast, in the dataset presented in this study, a three-level annotation method is employed to produce first annotations on a part of the dataset. Then, an initial detector is trained on these data to boost the semi-automatic annotation process. Because the annotations rely on the radiologists’ clinical experience, this method ensures the accuracy and efficiency of the annotations.

In addition, various methods have been employed to construct automatic lung nodule detection systems. Lung nodules only appear inside the lung regions, so effective lung region segmentation can avoid the detection of lesions outside the lung regions, reducing FPs. Threshold-based methods are the most common for segmenting the lung regions [9] and work well. The subsequent lung nodule detection stage consists of two models: volume of interest (VOI) detection models that guarantee the maximum sensitivity of the subsequent stages and classifier models for reducing FPs. Advances in this research can be divided into three periods. In the first and earliest period, neither the detectors nor the classifiers proposed were based on neural networks. All detection models were threshold-based methods [2], [4], [10], [11], [12], [13], like the lung segmentation methods. Threshold-based methods to select VOIs are more complex than lung region segmentation methods because lung nodules have more diverse shapes and edges. Then, simple linear or non-linear classifiers were trained to determine whether the selected VOIs are lung nodules. Their inputs were complicated hand-designed features, and not all of the essential features could be extracted to maximize the classifiers’ performance. In the second period of research, convolutional neural networks (CNNs) [14], [15], [16], [17] were employed to reduce the number of FPs. The detector methods employed during this period were still threshold-based, but they were more complex. In the third and most recent period, both detectors and classifiers are neural network-based models [9], [16], [18]. These studies proposed excellent methods for constructing effective models and obtained state-of-the-art results on open-source datasets.
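
As a rough illustration of the threshold-based idea, the following sketch marks low-density voxels in a CT volume expressed in Hounsfield units (HU). The function name `air_like_mask` and the cutoff of -600 HU are assumptions made for illustration and are not taken from this paper. A practical lung segmentation pipeline would also remove the air outside the body and fill holes, which this sketch omits.

```python
import numpy as np

def air_like_mask(volume_hu, threshold_hu=-600.0):
    """Boolean mask of voxels darker than an assumed HU cutoff.
    This is only the thresholding step, not a complete lung segmentation."""
    return volume_hu < threshold_hu

# Synthetic 2-slice volume: soft tissue (~40 HU) with a low-density block (-800 HU).
volume = np.full((2, 4, 4), 40.0)
volume[:, 1:3, 1:3] = -800.0
mask = air_like_mask(volume)
```
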
Nevertheless, the available open-source datasets only contain high-resolution CT screening images whose thicknesses range from 1.25 to 3 mm [4], and the studies mentioned above did not consider the problems caused by multi-resolution CT screening images. In clinical practice, however, lower-resolution CT screening images are acquired for physical examinations to reduce the radiation injury caused by CT screening. The dataset collected in this study contains CT screening images at two resolutions: a 1 mm thickness for thin-section images and a 5 mm thickness for thick-section images. A method to address the multi-resolution problem is proposed in this study. First, the DeepLNDataset’s thin-section data and thick-section data are separated. Two separate detectors are then trained, each on one of the two subsets, which reduces the influence of multiple resolutions. To extract the features from CT screening images effectively, a residual neural network is employed as the backbone. Multi-level feature fusion can improve the accuracy of nodule detection. Next, an ensemble method is proposed to improve the results of the two models.
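
One way to picture how the outputs of two detectors might be merged is plain greedy non-maximum suppression over the pooled detections. The sketch below is an assumption-laden illustration rather than the ensemble strategy proposed in this paper: each detection is a hypothetical tuple (x, y, z, d, score), overlap is measured as the IoU of axis-aligned cubes with side d, and the IoU threshold of 0.1 is an arbitrary choice.

```python
def cube_iou(a, b):
    """IoU of two axis-aligned cubes given as (x, y, z, d, ...), with side length d."""
    inter = 1.0
    for i in range(3):
        lo = max(a[i] - a[3] / 2, b[i] - b[3] / 2)
        hi = min(a[i] + a[3] / 2, b[i] + b[3] / 2)
        if hi <= lo:
            return 0.0                       # no overlap along this axis
        inter *= hi - lo
    union = a[3] ** 3 + b[3] ** 3 - inter
    return inter / union

def nms_ensemble(dets_a, dets_b, iou_thresh=0.1):
    """Pool detections from two models and keep the highest-scoring one
    among any group of overlapping candidates (greedy NMS)."""
    merged = sorted(dets_a + dets_b, key=lambda det: det[4], reverse=True)
    kept = []
    for det in merged:
        if all(cube_iou(det, k) <= iou_thresh for k in kept):
            kept.append(det)
    return kept

# Hypothetical outputs of the thin-section and thick-section detectors.
thin_model = [(10.0, 10.0, 10.0, 6.0, 0.9), (40.0, 40.0, 20.0, 8.0, 0.4)]
thick_model = [(11.0, 10.0, 10.0, 6.0, 0.7), (70.0, 30.0, 15.0, 5.0, 0.6)]
final = nms_ensemble(thin_model, thick_model)
```

Here the two detections near (10, 10, 10) overlap strongly, so only the higher-scoring one is retained, while the non-overlapping detections from both models survive.
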

The contributions of this work can be summarized as follows:

  1. A framework for automatic lung nodule detection from multi-resolution CT screening images is proposed that has obtained promising results in clinical practice.

  2. A three-level annotation criterion and a semi-automatic annotation system are proposed to construct a multi-resolution CT screening image dataset called the DeepLNDataset.

  3. The influence of CT screenings with different resolutions is analyzed in depth, and an ensemble strategy is proposed to tackle this issue.

The rest of this paper is organized as follows. Section 2 reviews related work. Section 3 describes the construction of the DeepLNDataset. Section 4 presents the DeepLN method. Evaluations of our proposed methods and some analysis are given in Section 5. We conclude and discuss future work in Section 6.

Object detection in natural images

With the development of deep learning, object detection in natural images has advanced significantly.

Region-based convolutional neural networks (RCNNs) were proposed in [19] and were the first type of CNN to successfully detect objects. The method employs a fine-tuned model to extract the features of each proposed region and determine whether or not the region is an object. RCNNs represented a huge leap in object detection research.

Dataset

The data in the DeepLNDataset were provided by the West China Hospital, Sichuan University, China. All of the CT screening images were collected from patients when admitted to the hospital or at follow-up. To guarantee the accuracy of the dataset, not all available CT screening images were included in the dataset. The goal of this study is to assist radiologists with the detection of lung nodules before treatment. Hence, postoperative cases were removed because operations can cause changes to lung

DeepLN

Automatic lung nodule detection can be regarded as an object detection task in which the input is the CT screening images I and the output is the lung nodule locations, each consisting of four numbers [x,y,z,d]. Here, x, y, and z indicate the coordinates of the bounding box center in the 3D cubes and d indicates the diameter of the lung nodule. In this work, we aimed to construct a mapping F from I to [x,y,z,d].
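
The output representation [x,y,z,d] can be made concrete with a small sketch. The class name `Nodule` and its helper methods are hypothetical and serve only to show how a center and a diameter define a cubic bounding box; the coordinate units are whatever the detector uses.

```python
from typing import NamedTuple

class Nodule(NamedTuple):
    """One detection: bounding box center (x, y, z) and nodule diameter d."""
    x: float
    y: float
    z: float
    d: float

    def bounds(self):
        """Per-axis (low, high) extents of the cube of side d centered at (x, y, z)."""
        r = self.d / 2.0
        return ((self.x - r, self.x + r),
                (self.y - r, self.y + r),
                (self.z - r, self.z + r))

    def contains(self, px, py, pz):
        """True if the point lies inside the cube, boundaries included."""
        return all(lo <= p <= hi
                   for p, (lo, hi) in zip((px, py, pz), self.bounds()))

n = Nodule(x=50.0, y=60.0, z=20.0, d=8.0)
box = n.bounds()
```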

To reach this goal, a lung nodule detector called DeepLN is proposed, as shown in Fig. 6. A

Experiments

In this section, we present the results of experiments conducted on the DeepLNDataset and evaluate the effectiveness of the proposed method. First, several methods were compared and their performances were analyzed. Different input sizes and different combinations of multi-level features were employed to train each detector separately. The corresponding results were analyzed. The proposed ensemble method is also evaluated. Some detection results are presented as figures, and qualitative

Conclusions and future work

In this study, we constructed a multi-resolution CT image dataset and presented an automatic lung nodule detection framework named DeepLN to assist clinical physicians. First, to prepare a large-scale dataset, a three-level annotation criterion was proposed to construct a dataset that guarantees the accuracy of labeling. To improve the efficiency of labeling, a semi-automatic annotation system was constructed. Second, to detect lung nodules in a clinical dataset, a method was proposed to train

Acknowledgments

This work was supported by the National Major Science and Technology Projects of China under Grant 2018AAA0100201 and by the Science and Technology Project of Chengdu, PR China under Grant 2017-CY02-00030-GX.

References (46)

  • Pastorino U. et al., Annual or biennial CT screening versus observation in heavy smokers: 5-year results of the MILD trial, Eur. J. Cancer Prev. (2012)
  • Ciompi F. et al., Towards automatic pulmonary nodule management in lung cancer screening with deep learning, Sci. Rep. (2017)
  • Armato R.S. et al., The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT scans, Med. Phys. (2015)
  • Liao F. et al., Evaluate the malignancy of pulmonary nodules using the 3D deep leaky Noisy-or network (2017)
  • F. Society, A.A. Bankier, C.J. Herold, J.H. Austin, W.D. Travis, Recommendations for the Management of Subsolid...
  • Zhao Y. et al., Performance of computer-aided detection of pulmonary nodules in low-dose CT: comparison with double reading by nodule volume, Eur. Radiol. (2012)
  • A.A.A. Setio, C. Jacobs, J. Gelderblom, B. Ginneken, Automatic detection of large pulmonary solid nodules in thoracic...
  • van Ginneken B. et al., Off-the-shelf convolutional neural network features for pulmonary nodule detection in computed tomography scans, 2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI) (2015)
  • Arindra A. et al., Pulmonary nodule detection in CT images: False positive reduction using multi-view convolutional networks, IEEE Trans. Med. Imaging (2016)
  • Cai J. et al., Improving deep pancreas segmentation in CT and MRI images via recurrent neural contextual learning and direct loss function (2017)
  • Dou Q. et al., Multilevel contextual 3-D CNNs for false positive reduction in pulmonary nodule detection, IEEE Trans. Biomed. Eng. (2017)
  • Dou Q. et al., Automated pulmonary nodule detection via 3D convnets with online sample filtering and hybrid-loss residual learning
  • Girshick R. et al., Rich feature hierarchies for accurate object detection and semantic segmentation (2014)

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.knosys.2019.105128.