Adapted variable precision rough set approach for EEG analysis

https://doi.org/10.1016/j.artmed.2009.07.004

Summary

Objective

Rough set theory (RST) provides powerful methods for reduction of attributes and creation of decision rules, which have successfully been applied in numerous medical applications. The variable precision rough set model (VPRS model), an extension of the original rough set approach, tolerates some degree of misclassification of the training data. The basic idea of the VPRS model is to change the class information of those objects whose class information cannot be induced without contradiction from the available attributes. Thereafter, original methods of RST are applied.

An adaptation of this model is presented that allows uncertain objects to change class information during the process of attribute reduction and rule generation. This method is referred to as the variable precision rough set approach with flexible classification of uncertain objects (VPRS(FC) approach) and needs only slight modifications of the original VPRS model.

Methods and material

To compare the VPRS model and the VPRS(FC) approach, both methods are applied to a clinical data set based on electroencephalograms of awake and anesthetized patients. For comparison, a second data set obtained from the UCI machine learning repository is used. It describes the shape of different vehicle types. In addition, well-known feature selection methods were applied to both data sets to compare their results with those of the rough set based approaches.

Results

The VPRS(FC) approach requires higher computational effort, but is able to achieve better reduction of attributes for noisy or inconsistent data and provides smaller rule sets.

Conclusion

The presented approach is a useful method for substantial attribute reduction in noisy and inconsistent data sets.

Introduction

Rough Set Theory (RST), as introduced by Pawlak [1], [2], provides methods for knowledge reduction and induction of decision rules which have successfully been applied in numerous medical applications [3], [4], [5], [6], [7], [8]. Analysis of electroencephalogram (EEG) data has also been the subject of rough set applications [9], [10], [11]. EEG signals are electrical potentials generated by brain activity and measured on the scalp by non-invasive electrodes. These signals typically have amplitudes on the order of 10 μV. The EEG is usually distorted by a high level of noise and by different types of artifacts. Therefore, the analysis of EEG-based data is a particular challenge for methods of artificial intelligence.

The original RST is able to handle inconsistent granular data, but does not tolerate noise or inexact attribute values. The variable precision rough set (VPRS) model overcomes this drawback by allowing a previously defined degree of misclassification.

The VPRS model considers some objects of the given data set as misclassified or uncertain. In a first step, the full attribute set is used to evaluate which objects are regarded as misclassified. The class information (decision) of these objects is then changed, resulting in a data set that can be handled by methods of the original RST. Thus, the initial definition of misclassified objects is preserved throughout the entire process.
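The relabeling step described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the toy data, and the parametrization of `beta` as the minimum majority share required within an elementary set are assumptions made for this example.

```python
from collections import Counter, defaultdict


def elementary_sets(rows, attrs):
    """Group object indices by their values on the given condition attributes."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].append(i)
    return list(groups.values())


def relabel_uncertain(rows, attrs, decision, beta):
    """Return a copy of rows in which every object of an elementary set takes
    the majority class, provided the majority share is at least beta.
    (Assumption: beta is a minimum majority share, 0.5 < beta <= 1.)"""
    relabeled = [dict(row) for row in rows]
    for group in elementary_sets(rows, attrs):
        counts = Counter(rows[i][decision] for i in group)
        majority, n = counts.most_common(1)[0]
        if n / len(group) >= beta:
            for i in group:
                relabeled[i][decision] = majority
    return relabeled


# Hypothetical toy data: object 3 contradicts its three indiscernible neighbours.
rows = [
    {"a1": 0, "a2": 1, "d": "awake"},
    {"a1": 0, "a2": 1, "d": "awake"},
    {"a1": 0, "a2": 1, "d": "awake"},
    {"a1": 0, "a2": 1, "d": "anesthetized"},
    {"a1": 1, "a2": 0, "d": "anesthetized"},
]

# The full attribute set decides once which objects count as misclassified.
cleaned = relabel_uncertain(rows, ["a1", "a2"], "d", beta=0.75)
```

In this sketch the relabeled table `cleaned` is consistent, so methods of the original RST could be applied to it without further changes to the decisions.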

We present an adaptation of the VPRS model referred to as the variable precision rough set approach with flexible classification of uncertain objects (VPRS(FC) approach). In contrast to the VPRS model, misclassified objects are identified during the reduction process: first a reduced attribute set is selected, then misclassified objects are identified. This procedure completely ignores the information of removed attributes and may be advantageous when the removed attributes are noisy.
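A small sketch may illustrate why evaluating uncertain objects on the reduced attribute set can help. The function name, the attribute names `signal` and `noise`, the toy data, and the reading of `beta` as a minimum majority share are all assumptions for illustration; the paper's own conditions for a reduct are not reproduced here.

```python
from collections import Counter, defaultdict


def beta_positive_region(rows, attrs, decision, beta):
    """Indices of objects in elementary sets whose majority class share is >= beta.
    (Assumption: beta is a minimum majority share; beta = 1 gives the classical case.)"""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].append(i)
    positive = set()
    for group in groups.values():
        counts = Counter(rows[i][decision] for i in group)
        if counts.most_common(1)[0][1] / len(group) >= beta:
            positive.update(group)
    return positive


# Hypothetical toy data: "noise" splits the objects without helping the decision.
rows = [
    {"signal": 0, "noise": 0, "d": "A"},
    {"signal": 0, "noise": 0, "d": "A"},
    {"signal": 0, "noise": 0, "d": "A"},
    {"signal": 0, "noise": 1, "d": "A"},
    {"signal": 0, "noise": 1, "d": "B"},
    {"signal": 1, "noise": 0, "d": "B"},
    {"signal": 1, "noise": 1, "d": "B"},
]

# Uncertain objects are judged per candidate subset, not once on the full set.
full = beta_positive_region(rows, ["signal", "noise"], "d", beta=0.75)
reduced = beta_positive_region(rows, ["signal"], "d", beta=0.75)

# Classical case (beta = 1) for comparison.
classical_full = beta_positive_region(rows, ["signal", "noise"], "d", beta=1.0)
classical_reduced = beta_positive_region(rows, ["signal"], "d", beta=1.0)
```

With these toy values, dropping `noise` merges the ambiguous pair of objects into a larger elementary set with a clear majority, so `reduced` covers more objects than `full`. With `beta=1.0` the same removal shrinks the covered set instead.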

The VPRS(FC) approach largely satisfies the main definitions of the VPRS model; only slight modifications are needed. As a result, the VPRS(FC) approach can achieve a better reduction of attributes and smaller sets of decision rules, which are easier to interpret and allow a faster classification of new objects.

However, some fundamental assumptions of the original RST are no longer valid. As a consequence, techniques that limit the computational effort can no longer be applied.

In the next section, some fundamentals of the original RST are introduced with a focus on those statements that are affected by the VPRS(FC) approach. Section 3 gives a survey of the VPRS model. In Section 4 the VPRS(FC) approach is introduced and related work is reviewed. In particular, similar investigations by Mi [12] and Inuiguchi [13], which also allow flexible classification, are considered. Section 5 states the relation of rough set based attribute reduction to other feature selection methods. In Section 6 the VPRS model and the VPRS(FC) approach are applied to a clinical data set based on EEG signals from anesthetized unconscious and awake patients. To show that the VPRS(FC) approach performs well on different types of data, it is applied to a vehicle database from the University of California, Irvine (UCI) machine learning repository and the results are compared with those of the VPRS model. Alternatively, common feature selection methods are applied to the identical data sets and the resulting classification rates are presented in Section 8. In Section 9 the results are summarized and discussed.

Section snippets

Some basic notations of rough set theory

Rough set theory [2] operates on knowledge which is represented by a decision system S = (U, A, V, f), where U is a non-empty finite set of objects, called the universe. A = C ∪ {d} is a set of attributes comprising a set C of condition attributes and a decision attribute d, which provides the class information of the objects. V_a is the domain of a single attribute a and V is the set of all domains: V = ⋃ {V_a}, a ∈ A. For an object x, the value of an attribute a is given by the so-called information function f

Principles of variable precision rough set model

The original RST is able to handle inconsistencies in the data, which occur when indiscernible objects are assigned to different classes. However, the values of condition and decision attributes are expected to be exact. Noisy or vague data are beyond the scope of RST. In many real-world applications the assumption of exact data is not fulfilled and some objects are misclassified or condition attribute values are corrupted.

Even in noisy data we can assume that most of the attribute values are

Flexible classification of uncertain objects

The presented alternative method directly uses the decision table for the calculation of relative reducts. For each set of condition attributes B ⊆ C, the validity of Eqs. (6a) and (6b) is checked. In contrast to the common way of calculating β-relative reducts, the original values of the decision attribute remain unchanged until a subset B of condition attributes is selected. Then, the values of the decision attribute are determined according to the majority in the B-elementary sets, when the

Relation to other feature selection methods

RST and its extensions are the basis for the selection of relevant attributes in numerous applications in different fields [12], [13], [23], [31], [32], [33]. In the present section, the relation between RST based methods and some other important feature selection methods used for classification problems is illustrated.

EEG analysis by application of rough set methods

In this section, the rough set based attribute reduction methods are applied to a classification task of EEG signals. In addition to standard clinical monitoring, the EEG can provide additional information to assess the level of anesthesia [38]. Unfortunately, changes of the EEG signal are very complex and the anesthetist may not be able to judge the EEG patterns continuously in the operating theatre. Therefore, an automatic analysis and classification of the signals is essential for the

Application of rough set methods to vehicle data

As a second example, the two approaches are applied to a set of vehicle data obtained from the UCI machine learning repository [43]. In this database the decision attribute can take more than two different decisions. It represents a kind of data typical for medical applications that have to distinguish between several diagnoses.

Feature selection with alternative methods

This section presents feature selection from both EEG and vehicle data performed by alternative methods in order to compare the results with those from relative reducts according to the VPRS model and the VPRS(FC) approach. This will demonstrate strengths and weaknesses of the rough set methods compared to well-established feature selection methods. For comparison, classification based on the resulting feature sets is performed by rule generation and cross validation as described in Sections 6 EEG

Summary and conclusions

We presented an adaptation of the variable precision rough set model called the VPRS(FC) approach and applied this method to analyze EEG data from a clinical study. In contrast to the original model, new decisions are assigned to uncertain objects during and not before the calculation of relative reducts and rules. The main difference between the VPRS model (and also RST) and the VPRS(FC) approach is that the positive region of the VPRS(FC) approach can increase when attributes are removed. In general, the

Acknowledgements

The authors gratefully acknowledge Professor Ziarko for the fruitful discussions on this subject and for helpful advice on this paper.

We also thank the Turing Institute, Glasgow, Scotland, which donated the vehicle database, and the UCI machine learning repository for publishing this database.

This study was in part supported by a grant from B. Braun AG Melsungen, Germany.

References (45)

  • R. Swiniarski et al. Rough set methods in feature selection and recognition. Pattern Recognition Letters (2003)
  • R.B. Bhatt et al. On fuzzy-rough sets approach to feature selection. Pattern Recognition Letters (2005)
  • M. Kudo et al. Comparison of algorithms that select features for pattern classifiers. Pattern Recognition (2000)
  • F. Marcelloni. Feature selection based on a modified fuzzy C-means algorithm with supervision. Information Sciences (2003)
  • X.-W. Chen. An improved branch and bound algorithm for feature selection. Pattern Recognition Letters (2003)
  • Y. Qian et al. Measures for evaluating the decision performance of a decision table in rough set theory. Information Sciences (2008)
  • Z. Pawlak. Rough sets. International Journal of Computer and Information Sciences (1982)
  • Z. Pawlak. Rough sets, theoretical aspects of reasoning about data (1991)
  • S. Tsumoto. Automated discovery of positive and negative knowledge in clinical databases. IEEE Engineering in Medicine and Biology (2000)
  • K. Slowinski et al. Medical information systems—problems with analysis and way of solution
  • A. Ohrn et al. Rough sets: a knowledge discovery technique for multifactorial medical outcomes. American Journal of Physical Medicine & Rehabilitation (2000)
  • H. Midelfart et al. Learning rough set classifiers from gene expressions and clinical data. Fundamenta Informaticae (2002)