Artificial intelligence in healthcare operations to enhance treatment outcomes: a framework to predict lung cancer prognosis

Johnson, Marina; Albizri, Abdullah; Simsek, Serhat

doi:10.1007/s10479-020-03872-6

S.I.: Artificial Intelligence in Operations Management
Published: 23 November 2020

Annals of Operations Research, Volume 308, pages 275–305 (2022)

1711 Accesses
26 Citations

Abstract

Artificial Intelligence (AI) is critical for data-driven decision making to increase resource utilization, operational performance, and service quality in various industry domains, particularly in healthcare. Using AI in healthcare operations can significantly improve treatment outcomes and enhance patient satisfaction while reducing costs. In this paper, we propose a multi-stage framework to build an AI-based decision support tool that can predict the 5-year survivability of lung cancer patients. We evaluate the proposed framework using the Surveillance, Epidemiology, and End Results dataset for the 1973–2015 period, obtained from the National Institutes of Health. The first stage entails data preprocessing and target creation. The second stage applies six AI algorithms with feature selection through Particle Swarm Optimization and hyperparameter tuning with cross-validation. These algorithms are Logistic Regression, Decision Trees, Random Forests (RF), Adaptive Boosting (AdaBoost), Artificial Neural Network, and Naïve Bayes. The results show that the RF and AdaBoost models yield an AUC of 0.94 and outperform the other models. The third stage uses permutation importance to interpret the RF and AdaBoost models and applies Tree-based Augmented Naïve Bayes to gain insights into the interrelations among important features. The results of the third stage indicate that the number of lymph nodes containing metastases, the number of tumors that patients have had in their lifetime, the patient’s age, and the microscopic composition of cells rank among the most important features and can significantly impact patient survivability. We think this study has significant practical implications in helping physicians predict prognosis and develop treatment plans for lung cancer patients.
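
To make the interpretation step in the abstract concrete, the sketch below illustrates the general idea of permutation importance scored by AUC: shuffle one feature at a time and measure how much the AUC drops. It is a minimal illustration, not the authors' implementation. The feature names (`positive_nodes`, `noise`), the toy data, and the scoring function `toy_score` are hypothetical; `toy_score` stands in for a trained RF or AdaBoost model.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def permutation_importance(score_fn, rows, labels, n_repeats=10, seed=0):
    """Mean drop in AUC when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = auc(labels, [score_fn(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - auc(labels, [score_fn(r) for r in shuffled]))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Hypothetical toy data: each row is [positive_nodes, noise];
# label 1 means the patient survived five years.
rows = [[0, 5], [1, 3], [0, 8], [2, 1], [6, 4], [9, 7], [7, 2], [12, 6]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def toy_score(row):
    """Stand-in for a trained model: fewer positive nodes, higher score."""
    return -row[0]


baseline, importances = permutation_importance(toy_score, rows, labels)
print("baseline AUC:", baseline)
print("importance of positive_nodes:", importances[0])
print("importance of noise:", importances[1])
```

Because `toy_score` ignores the `noise` column, shuffling that column leaves the AUC unchanged and its importance is zero, while shuffling `positive_nodes` degrades the ranking. In practice the importance would usually be computed on held-out data rather than on the rows used to fit the model.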



Author information

Authors and Affiliations

Feliciano School of Business, Montclair State University, Montclair, NJ, USA
Marina Johnson, Abdullah Albizri & Serhat Simsek

Corresponding author

Correspondence to Abdullah Albizri.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Johnson, M., Albizri, A. & Simsek, S. Artificial intelligence in healthcare operations to enhance treatment outcomes: a framework to predict lung cancer prognosis. Ann Oper Res 308, 275–305 (2022). https://doi.org/10.1007/s10479-020-03872-6

Accepted: 08 November 2020
Published: 23 November 2020
Issue Date: January 2022
DOI: https://doi.org/10.1007/s10479-020-03872-6
