LIDP: A Lung Image Dataset with Pathological Information for Lung Cancer Screening

Shao, Yanbo; Wang, Minghao; Mai, Juanyun; Fu, Xinliang; Li, Mei; Zheng, Jiayin; Diao, Zhaoqi; Yin, Airu; Chen, Yulong; Xiao, Jianyu; You, Jian; Yang, Yang; Qiu, Xiangcheng; Tao, Jinsheng; Wang, Bo; Ji, Hua

doi:10.1007/978-3-031-16437-8_74

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13433)

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention

Abstract

Lung cancer is one of the most lethal cancers worldwide. Computed tomography (CT) makes it possible to diagnose lung cancer at an early stage, which can significantly reduce mortality. In recent years, deep neural networks (DNNs) have been widely used to improve the accuracy of classifying pulmonary nodules as benign or malignant. A limitation of the DNN approach, however, is that a model's performance and generalization depend heavily on the size and quality of the training data. To the best of our knowledge, almost all existing public lung nodule datasets, e.g., LIDC-IDRI, obtain the crucial benign and malignant labels by radiographic analysis rather than pathological examination. In this paper, we argue that, because these labels lack pathological confirmation and their authenticity is therefore uncertain, machine-learning (ML) models based on LIDC-IDRI generalize poorly. To test this hypothesis, we introduce a new lung CT image dataset with pathological information (LIDP) for lung cancer screening. LIDP contains 990 samples, of which 783 are malignant and 207 are benign. More critically, the labels of all samples have been confirmed by pathological biopsy. We evaluate various existing LIDC-based state-of-the-art (SOTA) models on LIDP. Our experimental results show that existing SOTA models trained on the LIDC-IDRI dataset generalize extremely poorly. Our conclusion is striking: the distributions of these datasets are significantly different. We claim that LIDP is a valuable addition to existing datasets such as LIDC-IDRI, and that it can be used for independent testing or for training new ML models for early lung cancer detection.
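
The class counts reported in the abstract imply a marked imbalance between malignant and benign samples. The following is a minimal sketch, not taken from the paper, that works through the arithmetic. The function names are hypothetical, and the "always malignant" classifier is an illustrative assumption used only to show why plain accuracy may be a misleading metric on a dataset with this class split.

```python
# Class counts as reported in the abstract.
MALIGNANT = 783
BENIGN = 207
TOTAL = MALIGNANT + BENIGN  # 990 samples


def class_fractions(malignant, benign):
    """Return the share of each class in the dataset."""
    total = malignant + benign
    return {"malignant": malignant / total, "benign": benign / total}


def always_malignant_baseline(malignant, benign):
    """Metrics for a hypothetical classifier that labels every sample malignant."""
    total = malignant + benign
    sensitivity = 1.0  # every malignant sample is flagged
    specificity = 0.0  # no benign sample is recognized as benign
    return {
        "accuracy": malignant / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


if __name__ == "__main__":
    fractions = class_fractions(MALIGNANT, BENIGN)
    baseline = always_malignant_baseline(MALIGNANT, BENIGN)
    print(f"total samples: {TOTAL}")
    print(f"malignant share: {fractions['malignant']:.3f}")
    print(f"benign share: {fractions['benign']:.3f}")
    print(f"trivial-baseline accuracy: {baseline['accuracy']:.3f}")
    print(f"trivial-baseline balanced accuracy: {baseline['balanced_accuracy']:.3f}")
```

Under these assumptions, a classifier that never predicts "benign" would still reach roughly 79% accuracy, so class-aware metrics such as sensitivity, specificity, or balanced accuracy are likely more informative when a model is tested on data with this split.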

Notes

1. http://ncrc.gyfyy.com/index.php?ac=article&at=read&did=509

Author information

Authors and Affiliations

Advanced Medical Data Research Center, College of Computer Science, Nankai University, Tianjin, China
Yanbo Shao, Minghao Wang, Juanyun Mai, Xinliang Fu, Mei Li, Jiayin Zheng, Zhaoqi Diao, Airu Yin & Hua Ji
Department of Lung Cancer, Radiology, Tianjin Lung Cancer Center, Tianjin Medical University, Tianjin, China
Yulong Chen, Jianyu Xiao & Jian You
AnchorDx Medical Co., Guangzhou, China
Yang Yang, Xiangcheng Qiu, Jinsheng Tao, Bo Wang & Hua Ji

Corresponding authors

Correspondence to Airu Yin or Hua Ji.

Editor information

Editors and Affiliations

Rochester Institute of Technology, Rochester, NY, USA
Linwei Wang
Chinese University of Hong Kong, Hong Kong, Hong Kong
Qi Dou
University of Virginia, Charlottesville, VA, USA
P. Thomas Fletcher
National Center for Tumor Diseases (NCT/UCC), Dresden, Germany
Stefanie Speidel
Case Western Reserve University, Cleveland, OH, USA
Shuo Li

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 310 KB)

About this paper

Cite this paper

Shao, Y. et al. (2022). LIDP: A Lung Image Dataset with Pathological Information for Lung Cancer Screening. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13433. Springer, Cham. https://doi.org/10.1007/978-3-031-16437-8_74

DOI: https://doi.org/10.1007/978-3-031-16437-8_74
Published: 16 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16436-1
Online ISBN: 978-3-031-16437-8
eBook Packages: Computer Science, Computer Science (R0)
