Elsevier

Computers in Biology and Medicine

Volume 47, 1 April 2014, Pages 147-160

Modeling and predicting the occurrence of brain metastasis from lung cancer by Bayesian network: A case study of Taiwan

https://doi.org/10.1016/j.compbiomed.2014.02.002

Abstract

The Bayesian network (BN) is a promising method for modeling cancer metastasis under uncertainty. A BN is represented graphically using bioinformatics variables and supports informed medical decisions and observations through probabilistic reasoning. In this study, we propose such a BN to describe and predict the occurrence of brain metastasis from lung cancer. A nationwide database containing more than 50,000 cases of cancer patients from 1996 to 2010 in Taiwan was used. The BN topology for studying brain metastasis from lung cancer was rigorously examined by domain experts/doctors. We used three statistical measures, namely accuracy, sensitivity, and specificity, to evaluate the performance of the proposed BN model and to compare it with three competing approaches: naive Bayes (NB), logistic regression (LR), and support vector machine (SVM). Experimental results show no significant differences in accuracy or specificity among the four models, while the proposed BN outperforms the others in terms of sampled average sensitivity. Moreover, the proposed BN has advantages over the other approaches in interpreting how brain metastasis develops from lung cancer. It is shown to be easily understood by physicians, efficient in modeling non-linear situations, capable of solving stochastic medical problems, and able to handle situations in which information is missing in the context of the occurrence of brain metastasis from lung cancer.

Introduction

Twenty to forty percent of cancer patients develop brain metastases during their illness [39], [38]. Lung cancer is a leading cause of death worldwide and often spreads to the brain: 65% of patients diagnosed with a primary tumor in their lungs will have brain metastases [7]. Survival time is longer when lung cancer is detected early because it is still localized and can be treated effectively. By contrast, survival time decreases and quality of life deteriorates once lung cancer metastasizes to the brain [13]. Modeling and predicting the development of brain metastasis from lung cancer is therefore necessary for its early detection. A model with good predictive ability can help physicians distinguish between lung cancer patients who will most likely develop brain metastasis and those who will suffer only from lung cancer. Such information will help physicians decide on the most suitable treatment for lung cancer patients and determine appropriate management to reduce or prevent brain metastasis. The clinical outcome for these patients can thereby be improved [14].

Previous studies have proposed a number of models to predict cancer outcomes. For example, Bajard et al. [3] used multivariate analysis to identify predictive factors of brain metastases in a group of patients with stage I to III non-small cell lung cancer (NSCLC). The variables included age at the time of diagnosis, gender, performance status, weight loss, stage, T-status, N-status, histological type, type of treatment, administration of chemotherapy, use of cisplatin, and response to initial treatment. In another study, a statistical method and conditional probability analysis were used to analyze the metastatic patterns of lung cancer cases; the patient characteristics included were age, gender, histology, lung cancer, and metastatic sites [37]. Hierarchical logistic regression (LR) was used to determine the predicted probability of metastatic disease to the brain as a function of age, sex, tumor size, cell type, location of the tumors, and lymph node stage of primary NSCLC patients [32]. Although traditional statistical and machine learning models, such as LR and support vector machine (SVM), are widely used for cancer prediction, these models are not as promising as the Bayesian network (BN), given that the BN can reason under uncertainty whereas LR and SVM cannot.

The BN is a powerful tool for representing stochastic events and conducting prediction tasks. Lowd and Domingos [27] proposed naive Bayes (NB) as an alternative to the BN for general probability estimation tasks. Their results showed that NB and BN have the same computational time and accuracy; however, NB cannot be applied in relational domains because of its independence assumption, whereas the BN is widely used. Zhang et al. [49] concluded that SVM and BN are the two best algorithms for predicting overweight and obesity from the Wirral database; however, [10] and Jayasurya et al. [20] concluded that BN handles missing data better than SVM and is therefore more suitable for the medical domain. Oh et al. [36] stated that the BN can approximate complex multivariable probability distributions of heterogeneous variables as interpretable local probabilities, in order to incorporate prior clinical and biological knowledge and to visualize and interpret the interactions among variables of interest for clinical use. Mancini et al. [29] stated that traditional statistical methods are ineffective in describing the relationships between variables in the biomedical domain because of their independence limitation, whereas the BN overcomes this limitation and has become a popular method for analyzing biomedical data. Lalande et al. [24] built a BN to identify older adults at high risk of falls; the tool integrates numerous risk factors drawn from the literature to obtain a fall risk assessment that gives robust results whatever the setting. In addition, the BN is an interesting model because it draws on both expert knowledge and knowledge contained in data. The BN can also be used as a classifier based on a learned network structure. As a result, the posterior probability distribution of each node can be computed, which is useful for decision-makers.
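To make the idea of posterior computation and missing-data handling concrete, the following is a minimal sketch of inference by enumeration in a toy three-node network. The structure (age and treatment as parents of metastasis), the variable names, and every probability value are hypothetical illustrations, not the topology or estimates of the model proposed in this paper.

```python
from itertools import product

# Hypothetical toy network: Age -> Metastasis <- Treatment.
# All probabilities are made-up illustration values, not study estimates.
P_AGE = {"young": 0.4, "old": 0.6}
P_TREATMENT = {"yes": 0.7, "no": 0.3}
P_METASTASIS = {  # P(metastasis = True | age, treatment)
    ("young", "yes"): 0.10,
    ("young", "no"): 0.20,
    ("old", "yes"): 0.15,
    ("old", "no"): 0.30,
}


def joint(age, treatment, metastasis):
    """Joint probability of one full assignment, using the chain rule of the BN."""
    p_m = P_METASTASIS[(age, treatment)]
    p_child = p_m if metastasis else 1.0 - p_m
    return P_AGE[age] * P_TREATMENT[treatment] * p_child


def posterior_metastasis(evidence):
    """Return P(metastasis = True | evidence).

    `evidence` maps "age" and/or "treatment" to an observed value.
    Variables missing from `evidence` are summed out (marginalized).
    """
    totals = {True: 0.0, False: 0.0}
    for age, treatment, metastasis in product(P_AGE, P_TREATMENT, (True, False)):
        assignment = {"age": age, "treatment": treatment}
        if all(assignment[name] == value for name, value in evidence.items()):
            totals[metastasis] += joint(age, treatment, metastasis)
    return totals[True] / (totals[True] + totals[False])


if __name__ == "__main__":
    # Fully observed case versus a case where the treatment record is missing.
    print(posterior_metastasis({"age": "old", "treatment": "no"}))
    print(posterior_metastasis({"age": "old"}))
```

With the treatment value missing, the query still returns a probability (about 0.195 for these made-up numbers) because the unobserved variable is averaged over its prior. Enumeration is exponential in the number of variables, so a real network would normally rely on a dedicated inference algorithm.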

Given the attractive characteristics of the BN model, researchers have applied it to various medical problems. For example, the BN has been used in mammographic diagnosis of breast cancer [21]. Hoot and Aronsky [17] created a BN model with 29 variables for predicting 90-day graft survival; its predictive performance, measured by the area under the receiver operating characteristic curve, was 0.674. Morales et al. [31] applied Bayesian classifiers to estimate the implantation probability of embryos in artificial insemination treatments from embryo images and to predict whether an embryo chosen for transfer would implant successfully. In terms of the receiver operating characteristic, they concluded that the tree augmented naive Bayes, k-dependence Bayesian, and naive Bayes classifiers performed almost as well as the semi naive Bayes and selective naive Bayes classifiers. Visscher et al. [46] used a BN for the diagnosis and treatment of ventilator-associated pneumonia. Oh et al. [36] developed a BN model to predict local failure in lung cancer. Corani et al. [9] presented a BN for predicting the outcome of in vitro fertilization (IVF); they concluded that the BN is equally or more predictive than well-recognized classification algorithms, with the further advantage of being biologically interpretable.
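Several of the studies above report the area under the receiver operating characteristic curve (AUC). As a reminder of what that number measures, here is a small sketch that computes AUC as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half. The function name and the sample data are illustrative assumptions and are unrelated to the cited results.

```python
def roc_auc(y_true, scores):
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    y_true: iterable of 0/1 labels; scores: predicted scores, higher = more positive.
    """
    positives = [s for label, s in zip(y_true, scores) if label == 1]
    negatives = [s for label, s in zip(y_true, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    labels = [1, 1, 0, 0]
    predicted = [0.9, 0.4, 0.5, 0.1]
    print(roc_auc(labels, predicted))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The quadratic pairwise loop is fine for a sketch; a sort-based rank formulation is preferable for large samples.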

The remainder of this paper is organized as follows: in Section 2, we briefly review BN along with other methods used in this study. Variables, data, graphical model construction, including evaluation criteria, are described in Section 3. The experimental results are presented in Section 4. We provide the discussion and conclusions in Section 5.

Materials and methods

In this section, we describe the proposed BN in detail. We also discuss the benchmark models (NB, LR, and SVM), the re-sampling techniques, and the model evaluation indexes used in this study.
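The three evaluation measures named in the abstract (accuracy, sensitivity, and specificity) all derive from the confusion matrix of a binary classifier. The sketch below shows the standard definitions; the function names and the example labels are assumptions made for illustration and do not reproduce the study's data or code.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels (1 = metastasis)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def evaluate(y_true, y_pred):
    """Return accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


if __name__ == "__main__":
    # Made-up labels: three patients with metastasis, five without.
    truth = [1, 1, 1, 0, 0, 0, 0, 0]
    predicted = [1, 1, 0, 0, 0, 0, 1, 0]
    print(evaluate(truth, predicted))
```

For these made-up labels the accuracy is 0.75, the sensitivity is about 0.667, and the specificity is 0.8. Sensitivity is the measure that matters most when missing a patient who will develop metastasis is the costlier error, which is why the abstract singles it out.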

Variables

Epidemiological studies have reported that age, gender, and residence are risk factors that may increase a person's chance of developing lung cancer [30], [22], [23], [42], [11]. Given that lung cancer metastasis occurs at the time of diagnosis or after undergoing treatment, treatment was also used as a factor for the occurrence of brain metastasis [3], [19]. Accordingly, in the present study, six variables were used to construct the proposed BN model: (1) age; (2) gender; (3) region of residence,

Experiments and results

In this section, we discuss our experimental results in two parts: the explanatory graphical model and inference, and predictive performance.

Conclusion

The prognosis for patients with brain metastases from lung cancer has usually been poor [40], [41]. In this study, we addressed this issue by presenting a Bayesian network model. To our knowledge, this is the first time a Bayesian network has been used to predict the occurrence of brain metastasis from lung cancer. We identified six variables through a rigorous correlation analysis and constructed a Bayesian network model that was examined by domain experts/doctors, including: (1)

Conflict of interest statement

None declared.

Acknowledgments

The authors gratefully acknowledge the comments and suggestions of the editor and the anonymous referees. This work is partially supported by the National Science Council, the top-research-university project and the model-of-vocational-university project of the Ministry of Education (Taiwan), and the National Taiwan University of Science and Technology – Taipei Medical University Joint Research Program.

This study is based in part on data from the National Health Insurance Research Database provided by

References (50)

  • L. Uusitalo, Advantages and challenges of Bayesian networks in environmental modeling, Ecol. Model. (2007)
  • S. Visscher et al., Modelling treatment effects in a clinical Bayesian network using Boolean threshold functions, Artif. Intell. Med. (2009)
  • American Cancer Society, Global Cancer Facts and Figures (2011)
  • T. Badriyah et al., Decision trees for predicting risk of mortality using routinely collected data, Int. J. Soc. Hum. Sci. (2012)
  • S. Bozkurt, A. Uyar, Comparison of Bayesian network and binary logistic regression methods for prediction of prostate...
  • J.S. Brown et al., Age and the treatment of lung cancer, Thorax (1996)
  • M.L. Cartman et al., Lung cancer: district active treatment rates affect survival, J. Epidemiol. Community Health (2002)
  • A. Chi et al., Treatment of brain metastasis from lung cancer, Cancers (2010)
  • A. Dekker, C. Dehing-Oberije, D. De uysscher, P. Lambin, A. Hope, K. Komati, G. Fung, Y.U. Shipeng, W. De Neve, Y....
  • L.F. Forrest et al., Socioeconomic inequalities in lung cancer treatment: systematic review and meta-analysis, PLOS Med. (2013)
  • I.T. Gavrilovic et al., Brain metastases: epidemiology and pathophysiology, J. Neuro-Oncol. (2005)
  • O. Graesslin, Nomogram to predict subsequent brain metastasis in patients with metastatic breast cancer, J. Clin. Oncol. (2010)
  • X. Guo et al., Support vector machine prediction model of early-stage lung cancer based on curvelet transform to extract texture features of CT image, World Acad. Sci. Eng. Technol. (2010)
  • W. Hämäläinen, M. Vinni, Comparison of machine learning methods for intelligent tutoring systems, in: Proceedings of...
  • N. Hoot, D. Aronsky, Using Bayesian networks to predict survival of liver transplant patients, in: Proceedings of AMIA...
Kung-Jeng Wang is a Professor in the Department of Industrial Management of National Taiwan University of Science and Technology (Taiwan Tech). He received his PhD in industrial engineering from the University of Wisconsin at Madison. Dr. Wang works closely with industry on research in manufacturing management and resource portfolio planning. He has published about 60 articles in international academic journals, such as IEEE Transactions on Systems, Man, and Cybernetics, IIE Transactions, European Journal of Operational Research, International Journal of Production Research, and Journal of Robotics and CIM. His current research interests are in the areas of intelligent systems, biomedical informatics, and supply chain management.

Bunjira Makond is a PhD candidate in the Department of Industrial Management of National Taiwan University of Science and Technology. Her research interest is in the area of biomedical informatics.

Kung-Min Wang is a surgeon at Shin-Kong Wu Ho-Su Memorial Hospital, Taiwan. His research interests are in the areas of biomedical informatics, clinical medicine, and cancer treatment.
