Abstract
Gastric Cancer (GC) is the most common malignancy of the digestive tract and the third leading cause of cancer-related mortality worldwide. Early prognosis methods, especially Machine Learning (ML)-based approaches, are among the main strategies against GC and have become necessary for identifying and prognosticating the factors that affect it. They enable specialists to accelerate the subsequent clinical management of patients who suffer from GC. This paper creates an ML-inspired Ensemble Method (EM) to identify the most significant factors of GC occurrence. The main objective of this research is to predict the probabilities of GC occurrence and its associated deaths. To achieve this goal, the created EM draws on several ML-based methods, including Least Absolute Shrinkage and Selection Operator (LASSO) and Ridge Regression, Elastic Net, Logistic Regression (LR), Random Forest (RF), Gradient Boosting Decision Trees (GBDTs), and a Deep Neural Network (DNN). The purpose of the EM is to reduce prediction errors when the number of patient features is large. The main novelties of this research include: (i) a sequential EM created by a Stacking method to predict the probability of GC and its associated deaths; (ii) using the significance level to make accurate predictions; (iii) employing two Chi-square tests to select the influential features; (iv) tuning the parameters of the applied ML models to avoid over-fitting and the amplification of errors; (v) applying different kinds of regression methods to treat high-dimensional cases; (vi) a new model for weighting the applied ML models. The implementation of the created EM in seven pioneering hospitals in the field of GC shows that the designed EM generates more precise predictions, with accuracies of 97.9% and 76.3% for predicting GC and its associated deaths, respectively.
Moreover, the Area Under the Curve (AUC) results confirm the capability of the created EM to predict the probability of GC and its related deaths, with AUC values of 98% and 90%, respectively.
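The abstract reports both accuracy and AUC for a weighted combination of ML models, and the two measures can differ noticeably. The following is a minimal sketch of how such a comparison might be computed. It is not the authors' weighting model, which the abstract does not specify: the function names, the simple normalised-weight averaging, and the 0.5 decision threshold are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def weighted_ensemble(probs_by_model: Dict[str, Sequence[float]],
                      weights: Dict[str, float]) -> List[float]:
    """Blend per-model probabilities using weights normalised to sum to 1.

    Assumption: a plain weighted average stands in for the paper's
    (unspecified) weighting model.
    """
    total = sum(weights[name] for name in probs_by_model)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n = len(next(iter(probs_by_model.values())))
    blended = [0.0] * n
    for name, probs in probs_by_model.items():
        if len(probs) != n:
            raise ValueError("all models must score the same patients")
        w = weights[name] / total
        for i, p in enumerate(probs):
            blended[i] += w * p
    return blended


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases classified correctly at a fixed threshold (assumed 0.5)."""
    hits = sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores))
    return hits / len(labels)


if __name__ == "__main__":
    # Made-up toy scores for six hypothetical patients; not data from the study.
    labels = [1, 0, 1, 0, 1, 0]
    probs = {
        "lr": [0.9, 0.4, 0.6, 0.3, 0.4, 0.6],
        "rf": [0.8, 0.2, 0.7, 0.5, 0.6, 0.3],
    }
    blended = weighted_ensemble(probs, {"lr": 1.0, "rf": 3.0})
    print("LR alone:", accuracy(labels, probs["lr"]), auc(labels, probs["lr"]))
    print("Blended: ", accuracy(labels, blended), auc(labels, blended))
```

Accuracy depends on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is one reason the two figures for a given outcome need not coincide.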
Notes
OLGA (Operative Link on Gastritis Assessment) and OLGIM (Operative Link on Gastric Intestinal Metaplasia Assessment) are grading and staging standards that depend on the histopathological findings of gastroscopic biopsy sampling.
The Charlson Comorbidity Index (CCI) is a method of categorizing patients' comorbidities based on the International Classification of Diseases (ICD).
About this article
Cite this article
Baradaran Rezaei, H., Amjadian, A., Sebt, M.V. et al. An ensemble method of the machine learning to prognosticate the gastric cancer. Ann Oper Res 328, 151–192 (2023). https://doi.org/10.1007/s10479-022-04964-1