Incorporating repeating temporal association rules in Naïve Bayes classifiers for coronary heart disease diagnosis

https://doi.org/10.1016/j.jbi.2018.03.002
Under an Elsevier user license
open archive

Highlights

  • Four Naïve Bayes classifiers representing temporal association rules (TARs) as features are developed using different feature representation methods.

  • The selection of the most frequent TARs is based on their support and confidence values.

  • Different feature selection and representation methods are tested.

  • First Classifier: Only the most frequent TARs that better discriminate the target class are selected as features.

  • Second Classifier: The most frequent TARs are selected as features after removing those that, based on medical knowledge, are not good predictors of the disease.

  • Feature representation methods include the binary representation, the horizontal support, and the mean duration.
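The three feature representations named above can be illustrated with a minimal sketch. It assumes each patient record has already been reduced to a list of TAR occurrences of the form (TAR identifier, start time, end time); that record layout, the function name `tar_features`, and the toy values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def tar_features(occurrences, selected_tars):
    """Compute three representations of each selected TAR for one patient.

    occurrences: list of (tar_id, start, end) tuples (hypothetical layout).
    selected_tars: the TARs chosen as features.
    """
    # Collect the duration of every occurrence, grouped by TAR.
    durations = defaultdict(list)
    for tar_id, start, end in occurrences:
        durations[tar_id].append(end - start)

    features = {}
    for tar_id in selected_tars:
        spans = durations.get(tar_id, [])
        count = len(spans)
        features[tar_id] = {
            # Binary: does the TAR occur at least once in the record?
            "binary": 1 if count > 0 else 0,
            # Horizontal support: how many times the TAR occurs in the record.
            "horizontal_support": count,
            # Mean duration of the occurrences (0.0 if the TAR never occurs;
            # this default is an assumption of the sketch).
            "mean_duration": sum(spans) / count if count else 0.0,
        }
    return features


# Made-up toy record: TAR_1 occurs twice, TAR_2 once, TAR_3 never.
patient = [("TAR_1", 0, 4), ("TAR_1", 10, 16), ("TAR_2", 3, 5)]
feats = tar_features(patient, ["TAR_1", "TAR_2", "TAR_3"])
```

In this toy record the binary representation cannot distinguish TAR_1 from TAR_2, whereas the horizontal support records that TAR_1 repeats.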

Abstract

In this paper, we develop a Naïve Bayes classification model integrated with temporal association rules (TARs). A temporal pattern mining algorithm is used to detect TARs by identifying the most frequent temporal relationships among the derived basic temporal abstractions (TAs). We develop and compare three classifiers that use the most frequent TARs as features: (i) the most frequent TARs detected within the target class (‘Disease = Present’), (ii) the most frequent TARs from both classes (‘Disease = Present’, ‘Disease = Absent’), and (iii) the most frequent TARs after removing those that are low-risk predictors for the disease. These classifiers use as features the horizontal support of TARs, defined as the number of times a particular temporal pattern is found in a patient’s record. All of the developed classifiers are applied to the diagnosis of coronary heart disease (CHD) using a longitudinal dataset. We compare two feature representations, the horizontal support and the mean duration of each TAR, each computed per patient. The results of this comparison show that the horizontal support representation outperforms the mean duration. The main aim of our research is to demonstrate that in medical domains where long time periods are significant, such as CHD, detecting the repeated occurrences of the most frequent TARs can yield better performance. We compared the best-performing classifier that uses the horizontal support representation with a Baseline Classifier that uses the binary representation of the most frequent TARs. The results show that the classifier using the horizontal support performs better than the Baseline Classifier.
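As a rough illustration of how horizontal support values could feed a Naïve Bayes classifier, the sketch below treats each TAR count as a categorical value and applies Laplace smoothing. The abstract does not say how the counts are modelled, so this treatment is an assumption; the function names `train_nb` and `predict_nb` and the toy data are likewise hypothetical and are not results from the paper.

```python
import math
from collections import Counter


def train_nb(rows, labels, alpha=1.0):
    """Fit a categorical Naive Bayes model (assumed modelling choice).

    rows: one list of horizontal support counts per patient.
    labels: the class label of each patient.
    """
    class_counts = Counter(labels)
    n_features = len(rows[0])
    # value_counts[c][j][v]: class-c patients whose feature j equals v.
    value_counts = {c: [Counter() for _ in range(n_features)] for c in class_counts}
    # Distinct values seen for each feature during training.
    values = [set() for _ in range(n_features)]
    for row, label in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[label][j][v] += 1
            values[j].add(v)
    return class_counts, value_counts, values, alpha


def predict_nb(model, row):
    """Return the class with the highest log posterior for one patient."""
    class_counts, value_counts, values, alpha = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)  # log prior
        for j, v in enumerate(row):
            # Reserve one extra category for values unseen in training.
            k = len(values[j]) + 1
            score += math.log((value_counts[c][j][v] + alpha) / (n_c + alpha * k))
        if score > best_score:
            best_label, best_score = c, score
    return best_label


# Made-up toy data: two TAR features, four patients.
rows = [[2, 0], [3, 1], [0, 0], [0, 1]]
labels = ["Present", "Present", "Absent", "Absent"]
model = train_nb(rows, labels)
repeated = predict_nb(model, [2, 1])   # first TAR repeats twice
absent = predict_nb(model, [0, 0])     # neither TAR occurs
```

In this toy set, a record in which the first TAR repeats is assigned to ‘Present’, while a record without any occurrence is assigned to ‘Absent’.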

Keywords

Bayesian models
Time series classification
Temporal abstraction
Temporal reasoning
Temporal association rules
