Machine learning for survival analysis: a case study on recurrence of prostate cancer

https://doi.org/10.1016/S0933-3657(00)00053-1

Abstract

Machine learning techniques have recently received considerable attention, especially when used to construct prediction models from data. Despite their potential advantages over standard statistical methods, such as their ability to model non-linear relationships and to construct symbolic, interpretable models, their applications to survival analysis are at best rare, primarily because of the difficulty of appropriately handling censored data. In this paper we propose a schema that enables the use of classification methods, including machine learning classifiers, for survival analysis. To appropriately consider the follow-up time and censoring, we propose a technique that, for patients for whom the event did not occur and who have short follow-up times, estimates their probability of event and assigns them a distribution of outcomes accordingly. Since most machine learning techniques do not deal with outcome distributions, the schema is implemented using weighted examples. To show the utility of the proposed technique, we investigate a particular problem of building prognostic models for prostate cancer recurrence, where only the prediction of the probability of the event (and not its dependency on time) is of interest. A case study on preoperative and postoperative prostate cancer recurrence prediction shows that, by incorporating this weighting technique, machine learning tools stand beside modern statistical methods and may, by inducing symbolic recurrence models, provide further insight into relationships within the modeled data.

Introduction

Among prognostic modeling techniques that induce models from medical data, the survival analysis methods are specific both in terms of modeling and the type of data required. The survival data normally include the censor variable, which indicates whether some outcome under observation (like death or recurrence of a disease) has occurred within some patient-specific follow-up time. The modeling technique must then consider that for some patients the follow-up may end before the event occurs. In other words, it must take into account that for patients for whom the event has not occurred during the follow-up period, the event may eventually occur.

Typically, given the patient’s data, survival models attempt to determine the probability that the event occurs within a specific time. Frequently, however, there are cases in survival analysis where predicting whether the event will eventually occur is of primary importance. For example, for the urologist deciding whether to operate on patients with clinically localized prostate cancer, the probability of cancer recurrence is a very important decision factor. In such cases, survival analysis requires pure classification models that predict either the occurrence or the non-occurrence of the event, optionally model the outcome probabilities, and appropriately consider the censoring.

Recently, the machine learning community has developed various tools that have been successfully used in the construction of classification models, including medical prognostic models [15], [18]. In this paper, we propose a framework that allows us to use machine learning techniques to construct classification models from survival data. To properly address censoring in the training data, patients for whom the event did not occur and who have short follow-up times require special treatment, because their final outcome is not known with certainty. Trivial solutions to this problem, such as removing these patients from the data set or treating them as examples where the event will not occur, would bias the modeling [22], [12] and should thus be avoided. To properly treat such cases, we propose a technique that assigns a distribution of outcomes instead of a single outcome. The distribution is assessed through the outcome probability estimate based on the Kaplan–Meier method. Since most machine learning techniques do not deal with outcome distributions, the schema is implemented using weighted examples. Although developed independently, the proposed technique is similar to the one used by Ripley and Ripley [22]. The main difference, however, is that they use data weighting only when testing the models, whereas different approaches are used for their construction.
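The weighting idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kaplan_meier`, `survival_at`, `weight_examples`), the fixed `horizon` parameter, and the conditional probability 1 - S(horizon)/S(t) used for a patient censored at time t are all assumptions made for this example. Each patient with an uncertain outcome is replaced by two weighted copies, one per class.

```python
from bisect import bisect_right

def kaplan_meier(times, events):
    """Return the distinct event times and the Kaplan-Meier survival estimate after each."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    steps_t, steps_s = [], []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            surv *= 1.0 - n_events / n_at_risk
            steps_t.append(t)
            steps_s.append(surv)
        n_at_risk -= n_removed
    return steps_t, steps_s

def survival_at(t, steps_t, steps_s):
    """S(t): estimated probability of remaining event-free up to time t."""
    k = bisect_right(steps_t, t)
    return 1.0 if k == 0 else steps_s[k - 1]

def weight_examples(times, events, horizon):
    """Map each patient to one or two (patient_index, label, weight) examples.

    Assumption of this sketch: a patient censored at t < horizon gets
    P(event) = 1 - S(horizon) / S(t); all other patients keep weight 1.
    """
    steps_t, steps_s = kaplan_meier(times, events)
    s_h = survival_at(horizon, steps_t, steps_s)
    examples = []
    for i, (t, e) in enumerate(zip(times, events)):
        if e:                       # event observed: outcome is certain
            examples.append((i, 1, 1.0))
        elif t >= horizon:          # followed long enough without an event
            examples.append((i, 0, 1.0))
        else:                       # censored early: split into two weighted copies
            s_t = survival_at(t, steps_t, steps_s)
            p = 1.0 - s_h / s_t if s_t > 0 else 1.0
            examples.append((i, 1, p))
            examples.append((i, 0, 1.0 - p))
    return examples
```

With this construction the weights of each patient's copies sum to one, so a learner that accepts example weights sees the same effective sample size as the original data.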

The benefits of the proposed framework stem from the potential advantages of machine learning methods. Symbolic induction techniques can help us to understand underlying relationships in the prostate cancer data. Some machine learning techniques can discover and use non-linearities and variable interactions [12], thus overcoming the limitations of linear statistical predictors.

We investigate the applicability of the proposed framework to the problem of modeling prostate cancer survival data and use two different machine learning methods. While any machine learning method that induces models from weighted examples may be used, a naive Bayes classifier and induction of decision trees were selected for our study because of their simplicity, acceptance and generally good performance. The two were compared to the Cox proportional hazards model [6], which is a standard statistical survival analysis technique for prediction based on multiple variables.
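To make concrete what inducing a model from weighted examples can look like, here is a minimal sketch of a naive Bayes classifier for categorical attributes in which every count is replaced by a sum of example weights. The class name `WeightedNaiveBayes`, the toy attribute values, and the use of Laplace smoothing are assumptions of this illustration and are not taken from the paper.

```python
from collections import defaultdict

class WeightedNaiveBayes:
    """Naive Bayes for categorical attributes, trained from weighted examples."""

    def fit(self, X, y, w):
        self.class_w = defaultdict(float)   # total weight per class
        self.counts = defaultdict(float)    # weight per (attribute, value, class)
        self.values = defaultdict(set)      # observed values per attribute
        for row, label, weight in zip(X, y, w):
            self.class_w[label] += weight
            for j, v in enumerate(row):
                self.counts[(j, v, label)] += weight
                self.values[j].add(v)
        return self

    def predict_proba(self, row):
        total = sum(self.class_w.values())
        scores = {}
        for c, cw in self.class_w.items():
            p = cw / total                  # weighted class prior
            for j, v in enumerate(row):
                k = len(self.values[j])
                # Laplace-smoothed conditional probability (assumption of this sketch)
                p *= (self.counts.get((j, v, c), 0.0) + 1.0) / (cw + k)
            scores[c] = p
        z = sum(scores.values())
        return {c: s / z for c, s in scores.items()}

# Hypothetical toy data: the last patient was censored early and therefore
# appears twice, once per class, with weights that sum to one.
X = [("high",), ("low",), ("low",), ("low",)]
y = [1, 0, 1, 0]
w = [1.0, 1.0, 0.25, 0.75]
model = WeightedNaiveBayes().fit(X, y, w)
p_high = model.predict_proba(("high",))
p_low = model.predict_proba(("low",))
```

Because the weights enter only through sums, the same pattern applies to any learner whose statistics are counts, which is one reason decision tree induction is also a natural fit.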

We use two separate datasets to construct the prostate cancer survival models. The preoperative dataset includes data on tests that were administered prior to the prostatectomy (prostate removal), while the postoperative dataset also includes data from several routinely performed pathologic tests. Preoperative data are generally fully known at least 2 weeks prior to the operation, while postoperative data are generally complete approximately a month following the operation. Clinically, both prediction models would be very useful. A model based on preoperative data could be used for patient decision making as to whether the expected benefit of prostatectomy is worth the potential treatment complications (impotence and incontinence). If the predicted probability of recurrence were high, patients might choose one of several other treatments that do not have the adverse effects of such an aggressive therapy. Postoperatively, a prediction model is also useful, but for different purposes. If recurrence could be predicted postoperatively, prior to actual recurrence, a second therapy (after the prostatectomy) could be administered quickly, when it is potentially the most effective. Thus, preoperative and postoperative prediction models are both very useful but for different purposes: deciding whether to undergo prostatectomy at all, and then whether to add additional treatment.

In Section 2 we begin by describing two prostate cancer datasets used in our experimental evaluation. The proposed treatment of censored data that uses outcome distributions (data weighting) is described in Section 3, together with a description of machine learning techniques, experimental design and statistics that were used to compare the performance of resulting models. Section 4 presents the experimental results and discusses the differences and advantages of selected prediction methods. An overview of related work is given in Section 5. Section 6 summarizes the results and concludes the paper.

Patient data

Two prostate cancer datasets were used in this study. Both include patients who were treated with radical prostatectomy and were followed up to observe the recurrence of the cancer. While the first dataset includes only preoperative data, the second dataset additionally incorporates data gathered postoperatively. The task in both cases was to construct a model that would, given the corresponding patient’s data, predict the probability of recurrence.

Methods

Two machine learning methods, the naive Bayes classifier and decision tree induction, were used and evaluated. Their performance was compared to a Cox proportional hazards model on the basis of classification accuracy, specificity and sensitivity, the correlation between the predicted probability and the probability estimated by the Kaplan–Meier method, and the concordance index (area under the receiver operating characteristic curve). We first explain how we treat censored data, then briefly introduce the machine
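As an illustration of the last of these statistics, the concordance index for a binary outcome can be computed as the fraction of (event, non-event) pairs in which the case with the event receives the higher predicted probability, with ties counted as one half. The sketch below is a plain pairwise version; the function name `concordance_index` is hypothetical, and the sketch ignores censoring and example weights, whose treatment is not described in this excerpt.

```python
def concordance_index(labels, scores):
    """Area under the ROC curve, computed over all (positive, negative) pairs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0    # positive case ranked above negative case
            elif p == n:
                concordant += 0.5    # ties count as half
    return concordant / (len(pos) * len(neg))

# Hypothetical predictions: three of the four pairs are ordered correctly.
auc = concordance_index([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two outcome groups.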

Results and discussion

For preoperative data, Table 3 shows the results of applying the different modeling techniques. Overall, naive Bayes and the Cox proportional hazards model seem to perform better than decision trees, although the differences are not significant.

The results for the concordance index are very similar to those reported in Kattan et al. [13], although they used a different validation technique (repeatedly drawing 70% of the cases for training while using the remaining 30% for testing). They

Related work

While there exist various statistical techniques to model survival-type data (e.g. Kaplan–Meier modeling and Cox’s regression [16]), machine learning techniques that would appropriately consider censored data are rare. Most notable exceptions come from the area of artificial neural networks, but even there the techniques vary from ignoring censored patients to treating them properly through modeling the hazard function. For instance, Snow et al. [23] developed a neural network to predict

Conclusions

Deciding whether to operate on patients with clinically localized prostate cancer frequently requires the urologist to classify patients into expected groups such as ‘remission’ or ‘recur’. In this paper we show that models for prostate cancer recurrence that may potentially support the urologist’s decision making can be induced from data using standard machine learning techniques, provided that follow-up and censoring have been appropriately considered. For the latter, we propose a weighting

Acknowledgements

This work was generously supported by the Slovene Ministry of Science and Technology and the Office of Information Technology at Baylor College of Medicine. The authors are grateful to Peter T. Scardino, M.D., of the Memorial Sloan Kettering Cancer Center for sharing his data.

References (25)

  • D. Faraggi et al. A neural network model for survival data. Stat. Med. (1995)
  • J.A. Hanley et al. The meaning and use of the area under receiver operating characteristic curve. Radiology (1982)