Attribute reduction methods in fuzzy rough set theory: An overview, comparative experiments, and new directions

doi:10.1016/j.asoc.2021.107353

Applied Soft Computing

Volume 107, August 2021, 107353

https://doi.org/10.1016/j.asoc.2021.107353

Highlights

• Attribute reduction methods based on fuzzy rough set theory are comprehensively reviewed.
• All methods are summarized from six different aspects.
• The experimental results show that the three types of reduction methods are effective.
• A statistical hypothesis test verifies that the performance of these algorithms differs.

Abstract

Fuzzy rough set theory is a powerful tool for dealing with uncertain information and has been successfully applied to attribute reduction, rule extraction, classification tree induction, and other fields. To comprehensively investigate attribute reduction methods in fuzzy rough set theory, this paper first briefly reviews the related concepts of fuzzy rough set theory. Then, all methods are summarized from six aspects: data sources, preprocessing methods, fuzzy similarity metrics, fuzzy operations, reduction rules, and evaluation methods. Among them, reduction rules are reviewed in three categories: fuzzy dependency-based, fuzzy uncertainty measure-based, and fuzzy discernibility matrix-based. These three types of reduction rules are compared and analyzed through experiments. The experimental results show that all three types of reduction rules can retain fewer attributes while improving or maintaining the classification accuracy of a classifier. Moreover, a statistical hypothesis test is conducted to evaluate the statistical difference between these methods. The results show that these algorithms are statistically significantly different. Finally, some new research directions are discussed.

Introduction

In recent years, rough set theory has received extensive attention in many fields such as data mining, pattern recognition, and machine learning. As a mathematical framework for handling uncertain data, the classical rough set (CRS) model has been successfully applied to attribute reduction (feature selection), rule extraction, and uncertainty reasoning [1], [2], [3], [4], [5], [6], [7]. CRS relies on equivalence relations, which are only suitable for categorical (nominal) attributes. In practice, categorical and numerical attributes (i.e., mixed attributes) often coexist in real-life databases, such as those arising in medical analysis and fault diagnosis [8]. One feasible way to deal with numerical attributes is to discretize them. However, discretization is a common source of information loss. In response to this problem, Dubois and Prade proposed the Fuzzy Rough Set (FRS) model [9], [10], which avoids discretizing real-valued data and can be applied directly to numerical or mixed features.
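To make the information-loss point concrete, the following minimal sketch contrasts the equivalence relation obtained after discretizing a numerical attribute with a fuzzy similarity relation computed directly on the raw values. The similarity formula 1 - |a(x) - a(y)| on min-max scaled values is only one common choice and is assumed here for illustration; the function names `crisp_relation` and `fuzzy_relation` and the toy values are hypothetical and are not taken from the paper.

```python
def crisp_relation(values, n_bins=2):
    """Equivalence relation induced by equal-width discretization."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant attribute
    # Bin index per object; the maximum value is clamped into the last bin.
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    return [[1.0 if bi == bj else 0.0 for bj in bins] for bi in bins]


def fuzzy_relation(values):
    """Fuzzy similarity R(x, y) = 1 - |a(x) - a(y)| on min-max scaled values."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    scaled = [(v - lo) / span for v in values]
    return [[1.0 - abs(si - sj) for sj in scaled] for si in scaled]


# Hypothetical values of one numerical attribute for four objects.
values = [36.0, 37.9, 38.1, 40.0]
crisp = crisp_relation(values)
fuzzy = fuzzy_relation(values)

# Objects 1 and 2 differ by only 0.2 but fall on opposite sides of the cut
# at 38.0, while objects 0 and 1 differ by 1.9 yet share a bin.
print(crisp[1][2], crisp[0][1])
print(round(fuzzy[1][2], 3), round(fuzzy[0][1], 3))
```

With two equal-width bins, the crisp relation treats the close pair (37.9, 38.1) as completely different and the distant pair (36.0, 37.9) as identical, whereas the fuzzy relation assigns them graded similarities of roughly 0.95 and 0.525, respectively.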

Existing research on FRS theory can be roughly divided into two topics: the construction of approximation spaces and the application of methods. On the one hand, a series of extensions of FRSs [11], [12], [13], [14], [15] have been investigated according to different requirements or application scenarios. On the other hand, FRSs have been successfully applied to many fields, such as attribute reduction [16], [17], [18], [19], rule extraction [20], classification tree induction [21], and medical analysis [22]. Among them, attribute reduction is one of the hot research topics in FRS theory; it aims to delete irrelevant or unimportant attributes while keeping the classification ability unchanged. The attributes retained by attribute reduction may further be used to construct learning models with better generalization ability and lower computational cost. According to whether decision (label) information is available, existing attribute reduction methods can be divided into supervised [23], semi-supervised [24], [25], and unsupervised [26], [27] methods. In addition, according to the search strategy, attribute reduction methods can also be divided into three types: filter-based [28], wrapper-based [29], and embedded [30] methods. In recent years, two other search strategies have also been studied, namely hybrid [31] and ensemble [25], [32] methods. However, reviews of attribute reduction methods based on FRS theory are still rare. Recently, the authors of [33] reviewed feature selection based on FRS theory and divided it into two types of methods: fuzzy dependency-based and fuzzy discernibility matrix-based. In fact, feature selection based on FRSs also includes fuzzy uncertainty measure-based methods. Moreover, comparative experiments on the related attribute reduction methods have not been reported. Therefore, it is necessary to further discuss attribute reduction methods based on FRS theory.

In view of the above discussion, this paper presents a comprehensive overview of attribute reduction methods based on FRS theory published before January 1, 2021. First, FRS theory is reviewed. Then, to clarify the application of FRS theory in attribute reduction, all methods are summarized from six aspects: data sources, preprocessing methods, fuzzy similarity metrics, fuzzy operations, reduction rules, and evaluation methods. Among them, attribute reduction rules are divided into three categories: (1) fuzzy dependency-based, (2) fuzzy uncertainty measure-based, and (3) fuzzy discernibility matrix-based. Next, these three types of attribute reduction rules are compared and analyzed through experiments. The experimental results show that these three types of methods can retain fewer attributes while improving or maintaining the accuracy of a classifier. In addition, a statistical hypothesis test is conducted to further evaluate the statistical difference between these methods. The results show that these algorithms are statistically significantly different. Finally, some future research directions are discussed. The research framework of attribute reduction methods based on FRS theory is shown in Fig. 1.
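As an illustration of the first category, the sketch below shows how a fuzzy dependency-based reduction rule is commonly formulated: a fuzzy similarity relation per attribute, combined across a subset with the minimum t-norm, a lower-approximation membership against crisp decision classes, and a greedy forward search that stops once the subset reaches the dependency of the full attribute set. These specific choices, the convention that the empty subset has dependency 0, and the names `similarity`, `dependency`, and `greedy_reduct` are assumptions made for this example; it is not the implementation of any particular algorithm compared in the paper, and the toy data are hypothetical.

```python
def similarity(data, attr):
    """Fuzzy similarity matrix of one attribute: 1 - |min-max scaled difference|."""
    col = [row[attr] for row in data]
    lo, hi = min(col), max(col)
    span = (hi - lo) or 1.0  # a constant attribute makes all objects fully similar
    s = [(v - lo) / span for v in col]
    return [[1.0 - abs(a - b) for b in s] for a in s]


def dependency(data, labels, attrs):
    """Fuzzy dependency of the decision on the attribute subset `attrs`."""
    if not attrs:
        return 0.0  # assumed convention for the empty subset
    mats = [similarity(data, a) for a in attrs]
    n = len(data)
    total = 0.0
    for i in range(n):
        # Lower-approximation membership of object i in its own decision class:
        # inf over y of max(1 - R_B(x_i, y), [y in class]). Only objects from
        # other classes can push the value below 1.
        pos = 1.0
        for j in range(n):
            if labels[j] != labels[i]:
                r = min(m[i][j] for m in mats)  # minimum t-norm across attributes
                pos = min(pos, 1.0 - r)
        total += pos
    return total / n


def greedy_reduct(data, labels):
    """Forward selection: repeatedly add the attribute with the highest dependency."""
    all_attrs = list(range(len(data[0])))
    target = dependency(data, labels, all_attrs)
    selected, current = [], 0.0
    while current < target - 1e-12:
        scores = {a: dependency(data, labels, selected + [a])
                  for a in all_attrs if a not in selected}
        best = max(scores, key=scores.get)
        selected.append(best)
        current = scores[best]
    return selected, current


# Hypothetical data: four objects, three numerical attributes, two classes.
# Attribute 1 is constant and therefore carries no discriminating information.
data = [
    [0.1, 0.5, 0.9],
    [0.2, 0.5, 0.1],
    [0.8, 0.5, 0.8],
    [0.9, 0.5, 0.2],
]
labels = [0, 0, 1, 1]

selected, gamma = greedy_reduct(data, labels)
print(selected, gamma)
```

On this toy data the search keeps attributes 0 and 2 and discards the constant attribute 1, because the two retained attributes already reach the dependency of the full attribute set. Uncertainty measure-based and discernibility matrix-based rules would replace the `dependency` function with an entropy-like measure or with a matrix of discerning attribute sets, respectively.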

The rest of this paper is organized as follows. In Section 2, we briefly introduce the preliminaries of FRS theory. In Section 3, attribute reduction methods based on FRS theory are summarized from six aspects. Discussions on these methods are carried out in Section 4. The analyses of the comparative experiments are presented in Section 5. Some future research directions are discussed in Section 6. Finally, Section 7 summarizes the paper.

Section snippets

Preliminaries

This section reviews some definitions and symbols of FRSs [8], [9], [13], [15].

A Fuzzy Information System (FIS) is a quadruple $FIS = (U, A, V, f)$, where $U$ is a non-empty finite set of objects, called the universe of discourse (universe); $A$ is a non-empty finite set of attributes; $V$ is the union of attribute domains, i.e., $V = \bigcup_{a \in A} V_{a}$, where $V_{a}$ is the domain of the attribute $a$; $f : U \times A \to V$ is an information function that satisfies $f_{a}(x) \in V_{a}$ for all $a \in A$ and $x \in U$. When $A = C \cup D$ and $C \cap D = \emptyset$, the FIS is called the

Attribute reduction based on FRSs

In 1992, Kuncheva first proposed a feature selection method using FRSs [38]. The weak fuzzy division was used to define the fuzzy positive, negative, and boundary regions, and then the idea of feature selection was proposed. However, since FRS theory was still in its infancy at that time, the method did not attract sufficient attention. In 2004, Jensen et al. proposed an attribute reduction method based on FRSs [39]. Since then, attribute reduction based on FRSs has received widespread attention. According to the

Discussion

First, Table 4, Table 5, and Table 6 list the literature on fuzzy dependency-based, fuzzy uncertainty measure-based, and fuzzy discernibility matrix-based methods, respectively, where “–” indicates that the method used is not explicitly given.

Based on Table 4, Table 5, and Table 6, some comparative analyses are summarized as follows.

(1) From the perspective of data sources, the data source in most documents is Link (1);
(2) From the perspective of preprocessing methods, most of the methods given are

Experiments

In this section, to test the performance of the above attribute reduction methods, Fuzzy Information Entropy-based (FIE) [89], Fuzzy Complement Entropy-based Feature Selection (FCEFS) [101], Fuzzy Rough-based Feature Selection (FRFS) [41], Fuzzy Discernibility Matrix-based Feature Selection (FDMFS) [41], Fuzzy Discernibility Matrix-based Attribute Reduction (FDMAR) [111], Variable Precision Fuzzy Rough-based Feature Selection (VPRFS) [68], Fuzzy Neighborhood Rough Set-based (FNRS) [73]

Some remarks on new research directions

Attribute reduction is one of the core topics in FRS theory, and many challenging problems remain to be solved and studied further.

(1) Unsupervised or semi-supervised methods. Most existing attribute reduction methods are supervised methods. In real life, there are a lot of data with no or only partial decision information, but obtaining the decision information of the object is a challenging task. Therefore, when there is no or only part of decision information, how to

Conclusion

This paper reviewed the existing attribute reduction methods based on FRS theory. Firstly, the research and development of FRS theory are summarized. Then, in order to clarify the application of FRS models in attribute reduction, all methods are summarized from six aspects. Furthermore, three types of attribute reduction methods are compared and analyzed through experiments. The experimental results showed that the attribute reduction method based on FRSs can retain fewer features and improve

CRediT authorship contribution statement

Zhong Yuan: Conceptualization, Writing - original draft, Read and contributed to the manuscript. Hongmei Chen: Supervision, Project administration, Read and contributed to the manuscript. Peng Xie: Visualization, Read and contributed to the manuscript. Pengfei Zhang: Structure fabrication, Read and contributed to the manuscript. Jia Liu: Structure fabrication, Read and contributed to the manuscript. Tianrui Li: Structure fabrication, Read and contributed to the manuscript.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. 61976182, 62076171, 61876157, and 61976245), the Key Techniques of Integrated Operation and Maintenance for Urban Rail Train Dispatching Control System based on Artificial Intelligence (2019YFH0097), and Sichuan Key R&D project (2020YFG0035).

References (152)

Hong T.P. et al., Learning a coverage set of maximally general fuzzy rules by rough sets, Expert Syst. Appl. (2000)
Thangavel K. et al., Dimensionality reduction based on rough set theory: A review, Appl. Soft Comput. (2009)
Liang J.Y. et al., A new measure of uncertainty based on knowledge granulation for rough sets, Inform. Sci. (2009)
Zhang P.F. et al., Multi-source information fusion based on rough set theory: A review, Inf. Fusion (2021)
Dubois D. et al., Putting rough sets and fuzzy sets together
Radzikowska A.M. et al., A comparative study of fuzzy rough sets, Fuzzy Sets and Systems (2002)
Mi J.S. et al., An axiomatic characterization of a fuzzy generalization of rough sets, Inform. Sci. (2004)
Moser B., On the T-transitivity of kernels, Fuzzy Sets and Systems (2006)
Mi J.S. et al., Generalized fuzzy rough sets determined by a triangular norm, Inform. Sci. (2008)
Hu Q.H. et al., Robust fuzzy rough classifiers, Fuzzy Sets and Systems (2011)
Wang X.Z. et al., Learning fuzzy rules from fuzzy samples based on rough set technique, Inform. Sci. (2007)
Hassanien A., Fuzzy rough sets hybrid scheme for breast cancer detection, Image Vis. Comput. (2007)
Liu K.Y. et al., Rough set based semi-supervised feature selection via ensemble selector, Knowl.-Based Syst. (2019)
Mac Parthaláin N. et al., Unsupervised fuzzy-rough set-based dimensionality reduction, Inform. Sci. (2013)
Dash M. et al., Consistency-based search in feature selection, Artificial Intelligence (2003)
Wang S. et al., Embedded unsupervised feature selection
Zhang X. et al., Feature selection in mixed data: A method using a novel fuzzy rough set-based information entropy, Pattern Recognit. (2016)
Bodjanova S., Approximation of fuzzy concepts in decision making, Fuzzy Sets and Systems (1997)
Wu W.Z. et al., Generalized fuzzy rough sets, Inform. Sci. (2003)
Wu W.Z. et al., Constructive and axiomatic approaches of fuzzy approximation operators, Inform. Sci. (2004)
Hu Q.H. et al., Gaussian kernel based fuzzy rough sets: Model, uncertainty measures and applications, Internat. J. Approx. Reason. (2010)
Kuncheva L.I., Fuzzy rough sets: application to feature selection, Fuzzy Sets and Systems (1992)
Jensen R. et al., Fuzzy–rough attribute reduction with application to web categorization, Fuzzy Sets and Systems (2004)
Qian Y. et al., Fuzzy-rough feature selection accelerator, Fuzzy Sets and Systems (2015)
Chen J.K. et al., A graph approach for fuzzy-rough feature selection, Fuzzy Sets and Systems (2020)
Lin Y.J. et al., Attribute reduction for multi-label learning with fuzzy rough set, Knowl.-Based Syst. (2018)
Yuan Z. et al., Hybrid data-driven outlier detection based on neighborhood information entropy and its developmental measures, Expert Syst. Appl. (2018)
Kryszkiewicz M., Rough set approach to incomplete information systems, Inform. Sci. (1998)
Quinlan J.R., Unknown attribute values in induction
Wang C.Z. et al., Fuzzy rough set-based attribute reduction using distance measures, Knowl.-Based Syst. (2019)
Yu D.R. et al., Uncertainty measures for fuzzy relations and their applications, Appl. Soft Comput. (2007)
Cornelis C. et al., Attribute selection with fuzzy decision reducts, Inform. Sci. (2010)
Shen Q. et al., Selecting informative features with fuzzy-rough sets and its application for complex systems monitoring, Pattern Recognit. (2004)
Jensen R. et al., Fuzzy-rough data reduction with ant colony optimization, Fuzzy Sets and Systems (2005)
Bhatt R.B. et al., On fuzzy-rough sets approach to feature selection, Pattern Recognit. Lett. (2005)
Bhatt R.B. et al., On the compact computational domain of fuzzy-rough sets, Pattern Recognit. Lett. (2005)
Hu Q.H. et al., Hybrid attribute reduction based on a novel fuzzy-rough model and information granulation, Pattern Recognit. (2007)
Ganivada A. et al., Fuzzy rough sets, and a granular neural network for unsupervised feature selection, Neural Netw. (2013)
Zeng A.P. et al., A fuzzy rough set approach for incremental feature selection on hybrid information systems, Fuzzy Sets and Systems (2015)
Wang C.Z. et al., Feature subset selection based on fuzzy neighborhood rough sets, Knowl.-Based Syst. (2016)
Wei W. et al., Fuzzy rough approximations for set-valued data, Inform. Sci. (2016)
Li Y.W. et al., Feature selection for multi-label learning based on kernelized fuzzy rough sets, Neurocomputing (2018)
Zhang X. et al., A fuzzy rough set-based feature selection method using representative instances, Knowl.-Based Syst. (2018)
Sheeja T. et al., A novel feature selection method using fuzzy rough sets, Comput. Ind. (2018)
Ni P. et al., PARA: A positive-region based attribute reduction accelerator, Inform. Sci. (2019)
Ni P. et al., Incremental feature selection based on fuzzy rough sets, Inform. Sci. (2020)
Xu J.C. et al., Feature genes selection using fuzzy rough uncertainty metric for tumor diagnosis, Comput. Math. Methods Med. (2019)
Zadeh L.A., Probability measures of fuzzy events, J. Math. Anal. Appl. (1968)
Hu Q.H. et al., Information-preserving hybrid data reduction based on fuzzy-rough techniques, Pattern Recognit. Lett. (2006)
Wang C.Z. et al., Uncertainty measures for general fuzzy relations, Fuzzy Sets and Systems (2019)
