Adapted variable precision rough set approach for EEG analysis

doi:10.1016/j.artmed.2009.07.004

Artificial Intelligence in Medicine

Volume 47, Issue 3, November 2009, Pages 239-261

https://doi.org/10.1016/j.artmed.2009.07.004

Summary

Objective

Rough set theory (RST) provides powerful methods for reduction of attributes and creation of decision rules, which have successfully been applied in numerous medical applications. The variable precision rough set model (VPRS model), an extension of the original rough set approach, tolerates some degree of misclassification of the training data. The basic idea of the VPRS model is to change the class information of those objects whose class information cannot be induced without contradiction from the available attributes. Thereafter, original methods of RST are applied.

A variant of this model is presented that allows uncertain objects to change class information during the process of attribute reduction and rule generation. This method is referred to as the variable precision rough set approach with flexible classification of uncertain objects (VPRS(FC) approach) and needs only slight modifications of the original VPRS model.

Methods and material

To compare the VPRS model and the VPRS(FC) approach, both methods are applied to a clinical data set based on electroencephalograms of awake and anesthetized patients. For comparison, a second data set obtained from the UCI machine learning repository is used. It describes the shape of different vehicle types. In addition, well-known feature selection methods were applied to both data sets to compare their results with those of the rough set based approaches.

Results

The VPRS(FC) approach requires higher computational effort, but is able to achieve better reduction of attributes for noisy or inconsistent data and provides smaller rule sets.

Conclusion

The presented approach is a useful method for substantial attribute reduction in noisy and inconsistent data sets.

Introduction

Rough Set Theory (RST), as introduced by Pawlak [1], [2], provides methods for knowledge reduction and induction of decision rules which have successfully been applied in numerous medical applications [3], [4], [5], [6], [7], [8]. Analysis of electroencephalogram (EEG) data has also been the subject of rough set applications [9], [10], [11]. EEG signals are electrical potentials generated by brain activity and measured on the scalp by non-invasive electrodes. These signals typically have amplitudes of some 10 μV. The EEG is usually distorted by a high level of noise and by different types of artifacts. Therefore, the analysis of EEG-based data is a particular challenge for methods of artificial intelligence.

The original RST is able to handle inconsistent granular data, but does not tolerate noise or inexact attribute values. The variable precision rough set (VPRS) model overcomes this drawback by allowing a previously defined degree of misclassification.

The VPRS model considers some objects of the given data set as misclassified or uncertain. In a first step, the full attribute set is used to evaluate which objects are regarded as misclassified. The class information (decision) of these objects is changed, resulting in a data set that can be handled by methods of the original RST. Thus, the initial definition of misclassified objects is preserved throughout the entire process.
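The relabeling step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `relabel_by_majority`, the toy table, and the convention that `beta` is the minimum majority fraction an elementary set must reach are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def relabel_by_majority(objects, condition_attrs, decision_attr, beta):
    """Return a copy of `objects` in which every elementary set whose majority
    decision reaches the fraction `beta` is relabeled to that decision.
    (Hypothetical helper; `beta` as a minimum majority fraction is an assumption.)"""
    # Group object indices by their values on the condition attributes.
    elementary_sets = defaultdict(list)
    for idx, obj in enumerate(objects):
        key = tuple(obj[a] for a in condition_attrs)
        elementary_sets[key].append(idx)

    relabeled = [dict(obj) for obj in objects]  # leave the input untouched
    for members in elementary_sets.values():
        counts = Counter(objects[i][decision_attr] for i in members)
        majority, n_major = counts.most_common(1)[0]
        if n_major / len(members) >= beta:
            # Minority objects are treated as misclassified and relabeled.
            for i in members:
                relabeled[i][decision_attr] = majority
    return relabeled

# Toy decision table (invented values, not data from the study).
table = [
    {"a1": 0, "a2": 0, "d": "awake"},
    {"a1": 0, "a2": 0, "d": "awake"},
    {"a1": 0, "a2": 0, "d": "awake"},
    {"a1": 0, "a2": 0, "d": "anesthetized"},  # 3 of 4 agree: relabeled
    {"a1": 1, "a2": 0, "d": "anesthetized"},
    {"a1": 1, "a2": 0, "d": "awake"},         # 1 of 2 agree: left unchanged
]
relabeled = relabel_by_majority(table, ["a1", "a2"], "d", beta=0.7)
print([obj["d"] for obj in relabeled])
```

In this sketch the grouping always uses the full set of condition attributes, so the set of relabeled objects is fixed once, before any attribute reduction takes place.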

We present a variant of the VPRS model referred to as the variable precision rough set approach with flexible classification of uncertain objects (VPRS(FC) approach). In contrast to the VPRS model, misclassified objects are identified during the reduction process: first a reduced attribute set is selected, then misclassified objects are identified. This procedure completely ignores the information of removed attributes and may be advantageous when the removed attributes are noisy.
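A small Python sketch may help to show why identifying uncertain objects on a reduced attribute set can pay off when a removed attribute is noisy. It is only an illustration under stated assumptions: the helper `beta_positive_region`, the attribute names `signal` and `noise`, the toy data, and the reading of `beta` as a minimum majority fraction are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict

def beta_positive_region(objects, attrs, decision_attr, beta):
    """Indices of objects lying in elementary sets (with respect to `attrs`)
    whose majority decision reaches the fraction `beta`.
    (Hypothetical helper, assumed threshold convention.)"""
    groups = defaultdict(list)
    for idx, obj in enumerate(objects):
        groups[tuple(obj[a] for a in attrs)].append(idx)
    region = set()
    for members in groups.values():
        counts = Counter(objects[i][decision_attr] for i in members)
        if counts.most_common(1)[0][1] / len(members) >= beta:
            region.update(members)
    return region

# Toy data: "signal" carries the class information, "noise" does not.
data = [
    {"signal": 0, "noise": 0, "d": "awake"},
    {"signal": 0, "noise": 0, "d": "awake"},
    {"signal": 0, "noise": 0, "d": "awake"},
    {"signal": 0, "noise": 1, "d": "awake"},
    {"signal": 0, "noise": 1, "d": "anesthetized"},
    {"signal": 1, "noise": 0, "d": "anesthetized"},
    {"signal": 1, "noise": 1, "d": "anesthetized"},
    {"signal": 1, "noise": 1, "d": "anesthetized"},
]
full = beta_positive_region(data, ["signal", "noise"], "d", beta=0.75)
reduced = beta_positive_region(data, ["signal"], "d", beta=0.75)
print(len(full), len(reduced))  # prints "6 8"
```

With the noisy attribute included, the two objects with `signal = 0` and `noise = 1` form an evenly split elementary set and cannot be classified; once `noise` is dropped, they fall into a larger set with a clear majority. Evaluating uncertainty after the attribute subset is chosen therefore lets more objects be classified in this toy case.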

The VPRS(FC) approach largely satisfies the main definitions of the VPRS model; only slight modifications are required. As a result, the VPRS(FC) approach can achieve a better reduction of attributes and smaller sets of decision rules, which are easier to interpret and allow a faster classification of new objects.

However, some fundamental assumptions of the original RST are no longer valid. As a consequence, it is not possible to apply techniques for the limitation of the computational effort.

In the next section, some fundamentals of the original RST are introduced, with a focus on those statements that are affected by the VPRS(FC) approach. Section 3 gives a survey of the VPRS model. In Section 4, the VPRS(FC) approach is introduced and related work is reviewed. In particular, similar investigations by Mi [12] and Inuiguchi [13], which also allow flexible classification, are considered. Section 5 states the relation of rough set based attribute reduction to other feature selection methods. In Section 6, the VPRS model and the VPRS(FC) approach are applied to a clinical data set based on EEG signals from anesthetized unconscious and awake patients. To show that the VPRS(FC) approach performs well on different types of data, it is also applied to a vehicle database from the University of California, Irvine (UCI) machine learning repository, and the results are compared with those of the VPRS model. Alternatively, common feature selection methods are applied to the identical data sets, and the resulting classification rates are presented in Section 8. In Section 9, the results are summarized and discussed.

Section snippets

Some basic notations of rough set theory

Rough set theory [2] operates on knowledge which is represented by a decision system S = (U, A, V, f), where U is a non-empty finite set of objects, called universe. A = C ∪ {d} is a set of attributes comprising a set C of condition attributes and a decision attribute d, which provides the class information of the objects. V_a is the domain of a single attribute a and V is the set of all domains: V = ∪ {V_a}, a ∈ A. For an object x, the value of an attribute a is given by the so-called information function f

Principles of variable precision rough set model

The original RST is able to handle inconsistencies in the data, which occur when indiscernible objects are assigned to different classes. However, the values of condition and decision attributes are expected to be exact. Noisy or vague data are beyond the scope of RST. In many real-world applications, the assumption of exact data is not fulfilled, and some objects are misclassified or condition attribute values are corrupted.
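The notion of inconsistency used here can be made concrete with a short Python sketch. The helper `inconsistent_sets` and the toy table are hypothetical; the sketch only detects groups of indiscernible objects that carry different decisions and does not say anything about how noise is handled.

```python
from collections import defaultdict

def inconsistent_sets(objects, condition_attrs, decision_attr):
    """Map each combination of condition values to its set of decisions,
    keeping only combinations that occur with more than one decision.
    (Hypothetical helper for illustration.)"""
    decisions_by_key = defaultdict(set)
    for obj in objects:
        key = tuple(obj[a] for a in condition_attrs)
        decisions_by_key[key].add(obj[decision_attr])
    return {k: d for k, d in decisions_by_key.items() if len(d) > 1}

# Toy decision table (invented values).
table = [
    {"c1": "low", "c2": "yes", "d": "A"},
    {"c1": "low", "c2": "yes", "d": "B"},  # indiscernible from the first row
    {"c1": "high", "c2": "no", "d": "A"},
]
conflicts = inconsistent_sets(table, ["c1", "c2"], "d")
print(conflicts)
```

Here the first two objects share all condition values but differ in their decision, so they form an inconsistent elementary set; the third object is consistent.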

Even in noisy data we can assume that most of the attribute values are

Flexible classification of uncertain objects

The presented alternative method directly uses the decision table for the calculation of relative reducts. For each set of condition attributes B ⊆ C, the validity of Eqs. (6a), (6b) is checked. In contrast to the common way of calculating β-relative reducts, the original values of the decision attribute remain unchanged until a subset B of condition attributes is selected. Then, the values of the decision attribute are determined according to the majority in the B-elementary sets, when the

Relation to other feature selection methods

RST and its extensions are the basis for the selection of relevant attributes in numerous applications in different fields [12], [13], [23], [31], [32], [33]. In the present section, the relation between RST based methods and some other important feature selection methods used for classification problems is illustrated.

EEG analysis by application of rough set methods

In this section, the rough set based attribute reduction methods are applied to a classification task of EEG signals. In addition to standard clinical monitoring, the EEG can provide further information to assess the level of anesthesia [38]. Unfortunately, changes of the EEG signal are very complex and the anesthetist may not be able to judge the EEG patterns continuously in the operating theatre. Therefore, an automatic analysis and classification of the signals is essential for the

Application of rough set methods to vehicle data

As a second example, the two approaches are applied to a set of vehicle data obtained from the UCI machine learning repository [43]. In this database the decision attribute can take more than two different decisions. It represents a kind of data typical for medical applications that have to distinguish between several diagnoses.

Feature selection with alternative methods

This section presents feature selection from both EEG and vehicle data performed by alternative methods in order to compare the results with those from relative reducts according to the VPRS model and the VPRS(FC) approach. This will demonstrate strengths and weaknesses of the rough set methods compared to well-established feature selection methods. For comparison, classification based on the resulting feature sets is performed by rule generation and cross validation as described in Sections 6 EEG

Summary and conclusions

We presented a variant of the variable precision rough set model, called the VPRS(FC) approach, and applied this method to analyze EEG data from a clinical study. In contrast to the original model, new decisions are assigned to uncertain objects during, and not before, the calculation of relative reducts and rules. The main difference between the VPRS model (and also RST) and the VPRS(FC) approach is that the positive region of the VPRS(FC) approach can increase when attributes are removed. In general, the

Acknowledgements

The authors gratefully acknowledge Professor Ziarko for the fruitful discussions on this subject and for helpful advice on this paper.

We also thank the Turing Institute, Glasgow, Scotland, which donated the vehicle database, and the UCI machine learning repository for publishing this database.

This study was in part supported by a grant from B. Braun AG Melsungen, Germany.

References (45)

J. Komorowski et al., Modelling prognostic power of cardiac tests using rough sets, Artificial Intelligence in Medicine (1999)

M. Szczuka et al., Neuro-wavelet classifiers for EEG signals based on rough set methods, Neurocomputing (2001)

J.-S. Mi et al., Approaches to knowledge reduction based on variable precision rough set model, Information Sciences (2004)

Q. Shen et al., A rough-fuzzy approach for generating classification rules, Pattern Recognition Letters (2002)

Q. Shen et al., Combining rough sets and data-driven fuzzy learning for generation of classification rules, Pattern Recognition (1999)

Y. Qian et al., Converse approximation and rule extracting from decision tables in rough set theory, Computers & Mathematics with Applications (2008)

W. Ziarko, Variable precision rough set model, Journal of Computer and System Sciences (1993)

M.J. Beynon et al., Variable precision rough set theory and data discretisation: an application to corporate failure prediction, Omega (2001)

S. Tsumoto, Automated extraction of hierarchical decision rules from clinical databases using rough set model, Expert Systems with Applications (2003)

W. Zhu et al., Reduction and axiomization of covering generalized rough sets, Information Sciences (2003)

R. Swiniarski et al., Rough set methods in feature selection and recognition, Pattern Recognition Letters (2003)

R.B. Bhatt et al., On fuzzy-rough sets approach to feature selection, Pattern Recognition Letters (2005)

M. Kudo et al., Comparison of algorithms that select features for pattern classifiers, Pattern Recognition (2000)

F. Marcelloni, Feature selection based on a modified fuzzy C-means algorithm with supervision, Information Sciences (2003)

X.-W. Chen, An improved branch and bound algorithm for feature selection, Pattern Recognition Letters (2003)

Y. Qian et al., Measures for evaluating the decision performance of a decision table in rough set theory, Information Sciences (2008)

Z. Pawlak, Rough sets, International Journal of Computer and Information Sciences (1982)

Z. Pawlak, Rough sets, theoretical aspects of reasoning about data (1991)

S. Tsumoto, Automated discovery of positive and negative knowledge in clinical databases, IEEE Engineering in Medicine and Biology (2000)

K. Slowinski et al., Medical information systems—problems with analysis and way of solution

A. Ohrn et al., Rough sets: a knowledge discovery technique for multifactorial medical outcomes, American Journal of Physical Medicine & Rehabilitation (2000)

H. Midelfart et al., Learning rough set classifiers from gene expressions and clinical data, Fundamenta Informaticae (2002)
