Feature subset selection by genetic algorithms and estimation of distribution algorithms: A case study in the survival of cirrhotic patients treated with TIPS

doi:10.1016/S0933-3657(01)00085-9

Artificial Intelligence in Medicine

Volume 23, Issue 2, October 2001, Pages 187-205

https://doi.org/10.1016/S0933-3657(01)00085-9

Abstract

The transjugular intrahepatic portosystemic shunt (TIPS) is an interventional treatment for cirrhotic patients with portal hypertension. In the light of our medical staff’s experience, the consequences of TIPS are not homogeneous for all patients, and a subgroup dies in the first 6 months after TIPS placement. Currently, there is no risk indicator to identify this subgroup of patients before treatment. An investigation into predicting the survival of cirrhotic patients treated with TIPS is carried out using a clinical database with 107 cases and 77 attributes. Four supervised machine learning classifiers are applied to discriminate between the two subgroups of patients. The application of several feature subset selection (FSS) techniques has significantly improved the predictive accuracy of these classifiers and considerably reduced the number of attributes in the classification models. Among the FSS techniques, FSS–TREE, a new randomized algorithm inspired by the new EDA (estimation of distribution algorithm) paradigm, has obtained the best average accuracy results for each classifier.

Introduction

Portal hypertension is a major complication of chronic liver disease. By definition, it is a pathological increase in the portal venous pressure which results in the formation of porto-systemic collaterals that divert blood from the liver to the systemic circulation. It is caused by both an obstruction to outflow in the portal flow and an increased mesenteric flow. In the western world, cirrhosis of the liver accounts for approximately 90% of the patients.

Of the sequelae of portal hypertension (i.e. varices, encephalopathy, hypersplenism, ascites), bleeding from gastro-oesophageal varices is a significant cause of early mortality (approximately 30–50% at the first bleed) [4], [41].

Many efforts have been made over the past decades in the treatment of portal hypertension. This has resulted in an increasing number of randomized trials and publications but, unfortunately, the therapeutic decision is still not easy [10].

The transjugular intrahepatic portosystemic shunt (TIPS) is an interventional treatment resulting in decompression of the portal system by creation of a side-to-side portosystemic anastomosis. Since its introduction over 10 years ago [39], [40] and despite the large number of published studies, many questions remain unanswered. Currently, little is known about the effects of TIPS on the survival of the treated patients.

Our medical staff has found that a subgroup of patients dies in the first 6 months after a TIPS placement, while the rest survive for longer periods. Currently, there is no risk indicator to distinguish the two subgroups of patients. We are equally interested in the detection of both subgroups, giving the same relevance to the reduction of both error types.

A period of 6 months is chosen because we think that, beyond that period, factors such as stenosis of the shunt and possible variceal rebleeding as a consequence would confound the analysis. Furthermore, a critical criterion for choosing this period is that the average waiting time on a list for a liver transplant at the University Clinic of Navarra is approximately 6 months. The only published study [31] to identify a subgroup of patients who die within a period after a TIPS placement fixes the length of this period at 3 months. However, we think that our specific conditions suggest lengthening this period to 6 months.

Traditionally, Pugh’s modification of the Child–Turcotte score (referred to as the Child–Pugh score) has been used to assess risk in patients undergoing portosystemic shunt surgery [37]. Although it is a classic score to assess the seriousness of a patient’s liver disease, it has inherent problems when applied to patients undergoing TIPS, and it cannot be used to predict which patients will die within a certain period of time and which patients will survive that period. The difficulties and inaccuracies in applying the Child–Pugh score to predict survival periods have been detailed by Conn [8].

In the 1980s and 1990s, researchers in artificial intelligence developed new machine learning methods that construct predictive models from data, obtaining promising results in several clinical areas [9], [18], [35]. As far as we know, these kinds of techniques have never been applied to TIPS indication or contraindication. We assume that the prediction of patient survival within 6 months after elective TIPS is well suited to supervised machine learning methods.

We have performed a prospective study, building up a database which includes a structured and standardized history and clinical examination. In the study reported here, we concentrate on the task of predicting the survival of hospitalized patients within 6 months after a TIPS placement from their findings before the placement. For this purpose, in the first step, four well-known supervised machine learning methods with a long tradition in medical applications are applied: a Naive Bayes classifier, a decision-tree technique, a rule-learning procedure and a nearest neighbor method.

However, the database used has a large set of measured findings, and some of them seem to be irrelevant or redundant. It is well known that the accuracy of supervised machine learning methods is not monotonic with respect to the inclusion of features [26]: irrelevant or redundant attributes, depending on the specific characteristics of the classifier, may degrade the accuracy of the classification model. Ohmann et al. [35], in a problem of acute abdominal pain diagnosis, note that the high dimensionality of their study database is the major obstacle to improving the predictive accuracy of their supervised classification models. Accordingly, given the entire set of attributes, we aim to find the attribute subset with the best predictive accuracy for a given classifier. This problem is known in the machine learning community as the feature subset selection (FSS) problem, and it has been tackled with success in different medical areas [13], [21]. A reduction in the number of variables is of interest because classification models with fewer variables may be used more quickly and easily by clinicians, as these models require less data input [9]. Models with a relatively small number of variables may also be more readily converted into paper-based models that could be used widely in current medical practice. Other interesting effects of dimensionality reduction are the decrease in the cost of data acquisition and the rise in the interpretability and comprehensibility of the classification models.
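The non-monotonic effect of adding features can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 1-nearest-neighbour rule, the function name `loo_accuracy_1nn` and the six-row toy data set are assumptions made purely for illustration. The function scores a candidate attribute subset by leave-one-out accuracy, which is the kind of quantity a wrapper-style FSS search would try to maximize.

```python
from typing import List, Sequence


def loo_accuracy_1nn(X: Sequence[Sequence[float]],
                     y: Sequence[int],
                     subset: Sequence[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour rule that
    only looks at the feature indices listed in `subset`."""
    if not subset:
        return 0.0  # convention chosen here: an empty subset scores zero
    correct = 0
    for i in range(len(X)):
        best_dist, best_label = float("inf"), None
        for j in range(len(X)):
            if j == i:
                continue  # the held-out case never votes for itself
            # squared Euclidean distance restricted to the chosen features
            dist = sum((X[i][f] - X[j][f]) ** 2 for f in subset)
            if dist < best_dist:
                best_dist, best_label = dist, y[j]
        correct += int(best_label == y[i])
    return correct / len(X)


# Invented toy data: feature 0 separates the classes, feature 1 is noise
# on a much larger scale.
X: List[List[float]] = [
    [0.0, 5.0], [0.1, -4.0], [0.2, 9.0],
    [1.0, 5.1], [1.1, -4.1], [1.2, 9.1],
]
y: List[int] = [0, 0, 0, 1, 1, 1]

acc_informative = loo_accuracy_1nn(X, y, [0])
acc_with_noise = loo_accuracy_1nn(X, y, [0, 1])
print(acc_informative, acc_with_noise)
```

On this toy data the single informative feature gives perfect leave-one-out accuracy, while adding the noisy second feature drives it to zero, because the nearest neighbour is then decided by the noise column.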

Thus, in the second stage of the study, we apply two sequential and two genetic FSS techniques to the same survival prediction problem. We extend our comparison with the application of two new FSS procedures based on the new EDA (estimation of distribution algorithm) [33] paradigm.
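The general EDA idea can be sketched in Python as follows. This is a hedged illustration and not the procedure used in the paper: it uses a univariate, PBIL-style probability vector (no dependencies between features are modelled), and the function name `pbil_feature_selection`, its parameter values and the toy fitness function are all assumptions. Each generation samples feature masks from the current probabilities, keeps the best-scoring masks, and shifts the probabilities towards the bit frequencies observed in them.

```python
import random
from typing import Callable, List, Tuple


def pbil_feature_selection(
    n_features: int,
    fitness: Callable[[List[int]], float],
    pop_size: int = 20,      # masks sampled per generation (assumed value)
    n_select: int = 5,       # best masks used to update the model
    alpha: float = 0.3,      # learning rate of the probability update
    generations: int = 30,
    seed: int = 0,
) -> Tuple[List[int], float]:
    """Univariate EDA over 0/1 feature masks; returns the best mask seen."""
    rng = random.Random(seed)
    probs = [0.5] * n_features  # start with every feature equally likely
    best_mask: List[int] = [0] * n_features
    best_score = float("-inf")
    for _ in range(generations):
        # 1. Sample a population from the current probability model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # 2. Evaluate and rank the sampled masks.
        scored = sorted(
            ((fitness(mask), mask) for mask in population),
            key=lambda pair: pair[0],
            reverse=True,
        )
        if scored[0][0] > best_score:
            best_score, best_mask = scored[0][0], scored[0][1]
        # 3. Re-estimate the model from the selected individuals.
        elite = [mask for _, mask in scored[:n_select]]
        for i in range(n_features):
            freq = sum(mask[i] for mask in elite) / n_select
            probs[i] = (1 - alpha) * probs[i] + alpha * freq
    return best_mask, best_score


# Toy fitness (invented): number of positions agreeing with a hidden
# "ideal" mask. In a wrapper setting this would instead be an accuracy
# estimate for the classifier trained on the selected features.
target = [1, 0, 1, 0, 0, 1]


def toy_fitness(mask: List[int]) -> float:
    return float(sum(int(a == b) for a, b in zip(mask, target)))


best, score = pbil_feature_selection(6, toy_fitness)
print(best, score)
```

A genetic algorithm would replace step 3 with crossover and mutation on the selected masks; the EDA instead learns and samples an explicit probability model, and richer models than this independent one can capture interactions between features.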

Although the two new EDA-inspired FSS techniques are applied here to a specific medical problem, the proposed approach is general and can be used for other tasks where supervised machine learning algorithms face a high number of irrelevant and/or redundant features.

Costs of medical tests are not considered in the construction of the classification models, and the maximization of predictive accuracy is the principal goal of our research. As the cost of the TIPS placement is not insignificant, our study is developed to help physicians counsel patients and their families before deciding to proceed with elective TIPS.

The paper is organized as follows. The study database is described in Section 2. The supervised classifiers included in the study and the FSS methods used to improve their predictive accuracy are described in Section 3. Experimental results are presented in Section 4. The last section briefly summarizes the work and presents directions for future research in the field.


Patients: study database

The prospective study includes 127 patients with liver cirrhosis who underwent TIPS from May 1991 to September 1998 at the University Clinic of Navarra, Spain. The diagnosis of cirrhosis was based on liver histology in all cases.

The indications for TIPS placement were: prophylaxis of rebleeding (68 patients); refractory ascites (28 patients); prophylaxis of bleeding (11 patients); acute bleeding refractory to endoscopic and medical therapy (10 patients); portal vein thrombosis (9 patients) and

Supervised classifiers

In the study, four well-known supervised machine learning classifiers, with completely different approaches to learning, were applied to predict the survival of cirrhotic patients for the first 6 months after the placement of the TIPS. All the algorithms were selected for their simplicity and their long-standing tradition in medical diagnosis studies.

The Naive–Bayes (NB) rule [5] uses Bayes' theorem to predict the category for each case, assuming that the attributes are independent given the

Experiments

SFS and SBE are deterministic algorithms, so they are run only once for each classifier. Due to their randomized nature, GA-o, GA-u, FSS–PBIL and FSS–TREE are run 10 times for each classifier. Alongside the leave-one-out estimates of the predictive accuracy of the four classifiers without feature selection and with the SFS and SBE selection methods, Table 2 also reports the leave-one-out accuracy estimate of the best run of each randomized FSS method. Apart from the standard deviation of the

Summary and future work

A medical problem, the prediction of the survival of cirrhotic patients treated with TIPS, has been approached from a machine learning perspective, with the aim of obtaining a classification rule for the indication or contraindication of TIPS in cirrhotic patients. With the application of several feature selection techniques, the predictive accuracy of the applied classifiers is largely improved. Among the feature selection techniques, FSS–TREE, a new randomized algorithm inspired by the new EDA paradigm,

Acknowledgements

This work was supported by the PI 96/12 grant from Gobierno Vasco, Departamento de Educación, Universidades e Investigación and the grant UPV 140.226-EB131/99 from the University of the Basque Country.

References (43)

P.C. Bornman et al. Management of oesophageal varices. Lancet (1994)

G.F. Cooper et al. An evaluation of machine-learning methods for predicting pneumonia mortality. Artif. Intell. Med. (1997)

G. D’Amico et al. The treatment of portal hypertension: a meta-analytic review. Hepatology (1995)

J. Jelonek et al. Feature subset selection for classification of histological images. Artif. Intell. Med. (1997)

R. Kohavi et al. Wrappers for feature subset selection. Artif. Intell. (1997)

M. Kudo et al. Comparison of algorithms that select features for pattern classifiers. Pattern Recogn. (2000)

D. Michie. Personal models of rationality. J. Statist. Plann. Inference (1990)

C. Ohmann et al. Evaluation of automatic knowledge acquisition techniques in the diagnosis of acute abdominal pain. Artif. Intell. Med. (1996)

M. Rössle et al. New operative treatment for variceal haemorrhage. Lancet (1989)

D.W. Aha et al. Instance-based learning algorithms. Machine Learning (1991)

T. Bäck. Evolutionary algorithms in theory and practice. Oxford: Oxford University Press,...

S. Baluja. Population-based incremental learning: a method for integrating genetic search based function optimization...

B. Cestnik. Estimating probabilities: a crucial task in Machine Learning. In: Proceedings of ECAI-90, 1990....

C. Chow et al. Approximating discrete probability distributions with dependence trees. IEEE Trans. Inform. Theory (1968)

P. Clark et al. The CN2 induction algorithm. Machine Learning (1989)

H.O. Conn. A peek at the Child–Turcotte classification. Hepatology (1981)

T.G. Dietterich. Approximate statistical tests for comparing supervised learning algorithms. Neural Comput. (1998)

J. Doak. An evaluation of feature selection methods and their application to computer security. Technical Report...

D. Draper et al. A case study of stochastic optimization in health policy: problem formulation and preliminary results. J. Global Optim. (2000)

R. Etxeberria, P. Larrañaga. Global optimization with Bayesian networks. In: Proceedings of the II Symposium on...

N. Friedman, Z. Yakhini. On the sample complexity of learning Bayesian networks. In: Proceedings of the Twelfth...
