Multiple instance learning for lung pathophysiological findings detection using CT scans

  • Original Article
Medical & Biological Engineering & Computing

Abstract

Lung diseases affect the lives of billions of people worldwide, and 4 million people die prematurely each year from these conditions. These pathologies are characterized by specific imaging findings in CT scans. Traditional Computer-Aided Diagnosis (CAD) approaches have shown promising results in supporting clinicians; however, CAD systems normally analyze only a small part of the medical image, excluding information that may be relevant for clinical evaluation. The Multiple Instance Learning (MIL) approach instead considers several small regions of the image that are relevant for the final classification, producing a more comprehensive analysis of pathophysiological changes. This study uses MIL-based approaches to identify the presence of lung pathophysiological findings in CT scans for the characterization of lung disease development. The work focused on detecting the following findings: Fibrosis, Emphysema, Satellite Nodules in Primary Lesion Lobe, Nodules in Contralateral Lung, and Ground Glass. Fibrosis and Emphysema gave the best results, reaching an Area Under the Curve (AUC) of 0.89 and 0.72, respectively. Additionally, the MIL-based approach was used to predict the mutation status of EGFR, the most relevant oncogene in lung cancer, with an AUC of 0.69. The results showed that this comprehensive approach can be a useful tool for lung pathophysiological characterization.
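
To make the MIL idea in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the standard MIL assumption (a bag, here a CT scan, is positive if at least one of its instances, here image patches, is positive) and pools instance scores with a maximum. The scorer `mean_intensity`, the toy feature vectors, and the labels are all hypothetical; the abstract does not state which MIL method or features were used. The AUC is computed as the fraction of positive/negative bag pairs that are ranked correctly.

```python
from typing import Callable, List, Sequence

Instance = Sequence[float]   # feature vector of one image patch
Bag = List[Instance]         # all patches extracted from one scan


def bag_score(bag: Bag, instance_score: Callable[[Instance], float]) -> float:
    """Score a bag as the maximum of its instance scores (standard MIL assumption)."""
    if not bag:
        raise ValueError("a bag must contain at least one instance")
    return max(instance_score(x) for x in bag)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative bag")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_intensity(x: Instance) -> float:
    """Hypothetical instance scorer: the mean of the patch features."""
    return sum(x) / len(x)


# Toy data (invented for illustration): only bag-level labels are known.
bags: List[Bag] = [
    [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1]],  # positive bag: one abnormal patch
    [[0.1, 0.1], [0.2, 0.3]],              # negative bag
    [[0.3, 0.2], [0.7, 0.6]],              # positive bag
    [[0.2, 0.2], [0.4, 0.3], [0.1, 0.0]],  # negative bag
]
labels = [1, 0, 1, 0]

scores = [bag_score(b, mean_intensity) for b in bags]
auc = roc_auc(labels, scores)
print(f"bag-level AUC on the toy data: {auc:.2f}")
```

Max pooling is only one possible choice; other MIL formulations aggregate instances differently (for example by averaging or by embedding the whole bag), and a trained classifier would replace the hand-written scorer in practice.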

Acknowledgements

We acknowledge The Cancer Imaging Archive (TCIA) for making the open-access NSCLC-Radiogenomics dataset publicly available.

We acknowledge the University Hospitals of Geneva (HUG) for access to the ILD database, a multimedia database of interstitial lung diseases.

Funding

This work is financed by the European Regional Development Fund (ERDF) through the Operational Programme for Competitiveness and Internationalisation (COMPETE 2020 Programme) and by National Funds through the Portuguese funding agency FCT (Fundação para a Ciência e a Tecnologia), within project POCI-01-0145-FEDER-030263 and PhD grant number 2021.05767.BD.

Author information

Corresponding author

Correspondence to Tania Pereira.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Julieta Frade and Tania Pereira contributed equally to this work.

About this article

Cite this article

Frade, J., Pereira, T., Morgado, J. et al. Multiple instance learning for lung pathophysiological findings detection using CT scans. Med Biol Eng Comput 60, 1569–1584 (2022). https://doi.org/10.1007/s11517-022-02526-y

  • DOI: https://doi.org/10.1007/s11517-022-02526-y
