Abstract
Lung cancer is one of the deadliest cancers worldwide. Computed tomography (CT) makes it possible to diagnose lung cancer at an early stage, which can significantly reduce mortality. In recent years, deep neural networks (DNNs) have been widely used to improve the accuracy of benign-malignant classification of pulmonary nodules. A limitation of the DNN approach, however, is that a model's performance and generalization depend heavily on the size and quality of the training data. To the best of our knowledge, almost all existing public lung nodule datasets, e.g., LIDC-IDRI, obtain the crucial benign and malignant labels from radiographic analysis rather than pathological examination. In this paper, we argue that, because their labels are not confirmed by pathology reports, machine-learning (ML) models based on LIDC-IDRI generalize poorly. To test this hypothesis, we introduce a new lung CT image dataset with pathological information (LIDP) for lung cancer screening. LIDP contains 990 samples: 783 malignant and 207 benign. More critically, the labels of all samples have been confirmed by pathological biopsy. We evaluate various existing LIDC-based state-of-the-art (SOTA) models on LIDP. Our experimental results show that existing SOTA models trained on the LIDC-IDRI dataset generalize extremely poorly to LIDP. Our conclusion is striking: the distributions of these datasets are significantly different. We claim that LIDP is a valuable addition to existing datasets such as LIDC-IDRI, and that it can be used for independent testing or for training new ML models for early detection of lung cancer.
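The class counts stated in the abstract (783 malignant, 207 benign) imply a strongly imbalanced dataset, which matters when reading accuracy figures for models tested on LIDP. The following is a minimal sketch of that arithmetic only. It is not the authors' evaluation code; the function names are hypothetical, and the use of balanced accuracy is an assumption about one reasonable metric, not something the abstract specifies.

```python
# Class counts as stated in the abstract.
N_MALIGNANT = 783
N_BENIGN = 207
N_TOTAL = N_MALIGNANT + N_BENIGN  # 990 samples


def class_fractions(n_pos, n_neg):
    """Return (positive fraction, negative fraction) of a two-class dataset."""
    total = n_pos + n_neg
    return n_pos / total, n_neg / total


def majority_baseline_accuracy(n_pos, n_neg):
    """Accuracy of a trivial classifier that always predicts the larger class."""
    return max(n_pos, n_neg) / (n_pos + n_neg)


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity (assumed metric, for illustration)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


if __name__ == "__main__":
    frac_mal, frac_ben = class_fractions(N_MALIGNANT, N_BENIGN)
    print(f"malignant: {frac_mal:.3f}, benign: {frac_ben:.3f}")
    print(f"always-malignant accuracy: "
          f"{majority_baseline_accuracy(N_MALIGNANT, N_BENIGN):.3f}")
    # An always-malignant classifier gets every malignant case right
    # and every benign case wrong.
    print(f"always-malignant balanced accuracy: "
          f"{balanced_accuracy(N_MALIGNANT, 0, 0, N_BENIGN):.3f}")
```

Under these counts, a classifier that labels every sample malignant reaches about 79% plain accuracy but only 50% balanced accuracy, so a raw accuracy figure on LIDP is best read against that baseline.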
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Shao, Y. et al. (2022). LIDP: A Lung Image Dataset with Pathological Information for Lung Cancer Screening. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13433. Springer, Cham. https://doi.org/10.1007/978-3-031-16437-8_74
DOI: https://doi.org/10.1007/978-3-031-16437-8_74
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16436-1
Online ISBN: 978-3-031-16437-8
eBook Packages: Computer Science, Computer Science (R0)