Machine Learning Based Framework for Lung Cancer Detection and Image Feature Extraction Using VGG16 with PCA on CT-Scan Images

  • Original Research
  • SN Computer Science

Abstract

Lung cancer has one of the highest mortality rates worldwide among both men and women. This situation calls for new early detection approaches that enable more accurate diagnosis and treatment. In this study, we aim to improve lung cancer diagnosis by combining ensemble learning with image analysis in a single detection model. Specifically, we propose an approach that detects lung cancer from CT-scan images using an ensemble model. Feature extraction is performed with the VGG16 model. VGG16 is preferred because its small 3 × 3 kernels help capture fine image detail, and it gives state-of-the-art performance on transfer learning tasks. Because the features produced by VGG16 are high-dimensional, we reduce them to a lower-dimensional representation to simplify subsequent processing. This is achieved with Principal Component Analysis (PCA), which uses linear algebra to reduce the dimensionality of a feature set efficiently. Ensemble learning is then used to increase predictive accuracy by combining different classification algorithms; the proposed work combines the LR, GNB, and RF classifiers. In summary, VGG16 is used for feature extraction, PCA for removing correlated features and reducing dimensionality, and the ensemble for obtaining a more accurate and robust lung cancer detection system. Results show that our ensemble model outperforms the other models, with an accuracy of 97.8% in determining whether lung cancer is present. The proposed model improves accuracy by 1.3% over the existing model.
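The abstract does not state how the outputs of the LR, GNB, and RF classifiers are combined. As one possible reading, the sketch below assumes soft voting, in which the class probabilities of the three classifiers are averaged and the most probable class is chosen. The function name `soft_vote` and all probability values are hypothetical and serve only as illustration.

```python
def soft_vote(prob_lists):
    """Average per-class probabilities from several classifiers.

    prob_lists holds one probability vector per classifier, all of the
    same length. Returns (predicted_class_index, averaged_probabilities).
    """
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    averaged = [
        sum(probs[c] for probs in prob_lists) / n_models
        for c in range(n_classes)
    ]
    # The predicted class is the one with the highest averaged probability.
    label = max(range(n_classes), key=lambda c: averaged[c])
    return label, averaged


# Hypothetical outputs for a single CT image: [P(no cancer), P(cancer)].
lr_probs = [0.40, 0.60]   # assumed output of the LR classifier
gnb_probs = [0.55, 0.45]  # assumed output of the GNB classifier
rf_probs = [0.20, 0.80]   # assumed output of the RF classifier

label, averaged = soft_vote([lr_probs, gnb_probs, rf_probs])
print(label, [round(p, 3) for p in averaged])
```

Here one classifier disagrees with the other two, yet the averaged probabilities still favour the "cancer" class. In practice the probabilities would come from classifiers trained on the PCA-reduced VGG16 features; scikit-learn's `VotingClassifier` with `voting='soft'` implements this kind of probability-averaging vote.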
Beyond this comparison, the proposed model introduces several innovations: (i) VGG16 is used as the base model for feature extraction because its small receptive fields make it effective and its pre-trained features suit transfer learning; (ii) PCA is applied on top of the VGG16 features to simplify the model and make it more efficient; (iii) ensemble modelling techniques are applied to improve on the classification accuracy of the base classifiers. In conclusion, this study contributes to medical diagnostics by demonstrating the potential of integrating VGG16, PCA, and ensemble learning to develop a highly accurate lung cancer detection model. Such a diagnostic tool could help improve the performance of automated lung cancer diagnostic systems and lead to better patient outcomes.
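The benefit of small receptive fields in point (i) can be made concrete with a little arithmetic. The sketch below assumes stride-1 convolutions without dilation and an equal number of input and output channels; the channel count of 64 and the helper names are illustrative choices, not values taken from the paper. It shows that three stacked 3 × 3 layers cover the same 7 × 7 region as a single larger kernel while using fewer weights.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field (in pixels per side) of stacked stride-1 convolutions."""
    # Each additional layer widens the field by (kernel - 1) pixels.
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels):
    """Weight count of one kernel x kernel convolution, biases ignored.

    Assumes the layer has `channels` input and `channels` output maps.
    """
    return kernel * kernel * channels * channels


channels = 64  # assumed channel width, for illustration only

three_3x3 = 3 * conv_weights(3, channels)  # three stacked 3 x 3 layers
one_7x7 = conv_weights(7, channels)        # one 7 x 7 layer, same coverage

print(stacked_receptive_field(2), stacked_receptive_field(3))
print(three_3x3, one_7x7)
```

Under these assumptions the stacked design needs 27 weights per channel pair against 49 for the single 7 × 7 layer, and it also applies a non-linearity after each of the three layers.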

Data Availability

The data used in this study are available on request.

Funding

We certify that this work received no financial support of any kind.

Author information

Corresponding author

Correspondence to Amit Singh.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

About this article

Cite this article

Singh, A., Dwivedi, R.K. & Rastogi, R. Machine Learning Based Framework for Lung Cancer Detection and Image Feature Extraction Using VGG16 with PCA on CT-Scan Images. SN COMPUT. SCI. 5, 1040 (2024). https://doi.org/10.1007/s42979-024-03414-y

  • DOI: https://doi.org/10.1007/s42979-024-03414-y
