Attribute reduction methods in fuzzy rough set theory: An overview, comparative experiments, and new directions
Introduction
In recent years, rough set theory has received extensive attention in fields such as data mining, pattern recognition, and machine learning. As a mathematical framework for handling uncertain data, the classical rough set (CRS) model has been successfully applied to attribute reduction (feature selection), rule extraction, and uncertainty reasoning [1], [2], [3], [4], [5], [6], [7]. CRS relies on equivalence relations, which are only suitable for categorical (nominal) attributes. In practice, categorical and numerical attributes (i.e., mixed attributes) often coexist in real-life databases, for example in medical analysis and fault diagnosis [8]. One feasible way to deal with numerical attributes is to discretize them. However, discretization usually causes information loss. In response to this problem, Dubois and Prade proposed the Fuzzy Rough Set (FRS) model [9], [10], which avoids discretizing real-valued data and can be applied directly to numerical or mixed features.
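To make the information-loss argument concrete, the following minimal sketch contrasts a crisp equivalence relation obtained after discretization with a fuzzy similarity relation computed directly on a numerical attribute. The cut point, the sample values, and the similarity 1 - |x - y| on min-max scaled values are illustrative assumptions, not choices taken from the surveyed methods.

```python
def discretize(values, cut):
    """Crisp binning with one hypothetical cut point: 0 below the cut, 1 otherwise."""
    return [0 if v < cut else 1 for v in values]


def crisp_relation(labels):
    """Equivalence relation: 1.0 when two objects fall in the same bin, else 0.0."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]


def fuzzy_similarity(values):
    """Fuzzy similarity r(x, y) = 1 - |x - y| on values min-max scaled to [0, 1]."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # guard against a constant attribute
    scaled = [(v - lo) / span for v in values]
    return [[1.0 - abs(a - b) for b in scaled] for a in scaled]


# Hypothetical numerical attribute (e.g., body temperature in degrees Celsius).
temps = [36.5, 37.4, 37.6, 40.0]

crisp = crisp_relation(discretize(temps, cut=37.5))
fuzzy = fuzzy_similarity(temps)

# 37.4 and 37.6 are nearly identical, yet the cut point separates them,
# while 37.6 and 40.0 end up indistinguishable after binning.
print(crisp[1][2], crisp[2][3])
# The fuzzy relation keeps the graded information in both cases.
print(round(fuzzy[1][2], 3), round(fuzzy[2][3], 3))
```

In this toy case the crisp relation treats two nearly equal values as unrelated and two distant values as equivalent, whereas the fuzzy relation assigns them a high and a low similarity degree, respectively.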
Existing research on FRS theory can be roughly divided into two topics: the construction of approximation spaces and the application of the resulting methods. On the one hand, a series of extensions of FRSs [11], [12], [13], [14], [15] have been investigated to meet different requirements or application scenarios. On the other hand, FRSs have been successfully applied to many fields, such as attribute reduction [16], [17], [18], [19], rule extraction [20], classification tree induction [21], and medical analysis [22]. Among them, attribute reduction is one of the hot research topics in FRS theory; its goal is to delete irrelevant or unimportant attributes while keeping the classification ability unchanged. The attributes selected by attribute reduction may further be used to construct learning models with better generalization ability and lower computational cost. According to whether decision (label) information is available, existing attribute reduction methods can be divided into supervised [23], semi-supervised [24], [25], and unsupervised [26], [27] methods. In addition, according to the search strategy, attribute reduction methods can be divided into three types: filter-based [28], wrapper-based [29], and embedded [30] methods. In recent years, two other search strategies have also been studied, namely hybrid [31] and ensemble [25], [32] methods. However, reviews of attribute reduction methods based on FRS theory are still rare. Recently, the authors of [33] reviewed feature selection based on FRS theory and divided it into two types of methods: fuzzy dependency-based and fuzzy discernibility matrix-based. In fact, feature selection based on FRSs also includes fuzzy uncertainty measure-based methods. Moreover, comparative experiments on the related attribute reduction methods have not been reported. Therefore, it is necessary to further discuss attribute reduction methods based on FRS theory.
In view of the above discussion, this paper presents a comprehensive overview of attribute reduction methods based on FRS theory published before January 1, 2021. First, FRS theory is reviewed. Then, in order to clarify the application of FRS theory in attribute reduction, all methods are summarized from six aspects: data sources, preprocessing methods, fuzzy similarity metrics, fuzzy operations, reduction rules, and evaluation methods. Among them, attribute reduction rules are divided into three categories: (1) fuzzy dependency-based, (2) fuzzy uncertainty measure-based, and (3) fuzzy discernibility matrix-based. Finally, these three types of attribute reduction rules are compared and analyzed through experiments. The experimental results show that all three types of methods can retain fewer attributes while improving or maintaining the accuracy of a classifier. In addition, a statistical hypothesis test is conducted to evaluate whether the methods differ significantly. The results show that the differences among these algorithms are statistically significant. Further, some future research directions are discussed. The research framework of attribute reduction methods based on FRS theory is shown in Fig. 1.
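As an illustration of the first category, a fuzzy dependency-based reduction rule can be sketched as a greedy forward search that keeps adding the attribute with the largest fuzzy dependency until the dependency of the full attribute set is reached. The sketch below is only one possible instance under stated assumptions: the per-attribute similarity 1 - |x - y| / range, the minimum as the way to combine attributes, the lower approximation min over y of max(1 - R(x, y), D(y)) with crisp decision classes, and the names `attribute_relation`, `dependency`, and `greedy_reduct` are all illustrative choices, and the toy data are invented.

```python
def attribute_relation(column):
    """Fuzzy similarity matrix of one numerical attribute: 1 - |x - y| / range."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0  # a constant attribute makes all objects fully similar
    return [[1.0 - abs(a - b) / span for b in column] for a in column]


def dependency(relations, subset, labels):
    """Fuzzy dependency: mean membership of the objects in the fuzzy positive region.

    With crisp decision classes, the positive-region membership of object i is
    the minimum of 1 - R_B(i, j) over all objects j with a different label.
    """
    n = len(labels)
    total = 0.0
    for i in range(n):
        pos = 1.0
        for j in range(n):
            if labels[i] != labels[j]:
                # R_B is the minimum of the single-attribute relations (1.0 if B is empty).
                r = min((relations[a][i][j] for a in subset), default=1.0)
                pos = min(pos, 1.0 - r)
        total += pos
    return total / n


def greedy_reduct(data, labels, tol=1e-9):
    """Forward selection: add the attribute with the highest dependency each round."""
    m = len(data[0])
    relations = [attribute_relation([row[a] for row in data]) for a in range(m)]
    full = dependency(relations, list(range(m)), labels)
    reduct, best = [], dependency(relations, [], labels)
    while best < full - tol:
        gains = {a: dependency(relations, reduct + [a], labels)
                 for a in range(m) if a not in reduct}
        a_star = max(gains, key=gains.get)
        if gains[a_star] <= best + tol:
            break  # no remaining attribute improves the dependency
        reduct.append(a_star)
        best = gains[a_star]
    return reduct, best


# Invented toy data: attribute 0 separates the classes, attribute 1 is noisy,
# attribute 2 is constant and therefore carries no information.
data = [[0.1, 0.5, 1.0], [0.2, 0.9, 1.0], [0.8, 0.4, 1.0], [0.9, 0.8, 1.0]]
labels = [0, 0, 1, 1]

reduct, gamma = greedy_reduct(data, labels)
print(reduct, gamma)
```

Because the combined relation can only shrink when attributes are added, the dependency is monotone in the attribute subset under these assumptions, which is what makes the full-set dependency a natural stopping criterion for the greedy search.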
The rest of this paper is organized as follows. Section 2 briefly introduces the preliminaries of FRS theory. Section 3 summarizes attribute reduction methods based on FRS theory from six aspects. Section 4 discusses these methods. Section 5 presents the analysis of the comparative experiments. Section 6 discusses some future research directions. Finally, Section 7 concludes the paper.
Preliminaries
This section reviews some definitions and symbols of FRSs [8], [9], [13], [15].
A Fuzzy Information System (FIS) is a quadruple $FIS = (U, A, V, f)$, where $U$ is a non-empty finite set of objects, called the universe of discourse (universe); $A$ is a non-empty finite set of attributes; $V$ is the union of the attribute domains, i.e., $V = \bigcup_{a \in A} V_a$, where $V_a$ is the domain of the attribute $a$; and $f: U \times A \to V$ is an information function that satisfies $f(x, a) \in V_a$ for all $a \in A$ and $x \in U$. When $A = C \cup D$ and $C \cap D = \emptyset$, the FIS is called the
Attribute reduction based on FRSs
In 1992, Kuncheva first proposed a feature selection method using FRSs [38]. A weak fuzzy partition was used to define the fuzzy positive, negative, and boundary regions, on which the idea of feature selection was built. However, since FRS theory was still in its infancy at that time, the method did not attract sufficient attention. In 2004, Jensen et al. proposed an attribute reduction method based on FRSs [39]. Since then, attribute reduction based on FRSs has received widespread attention. According to the
Discussion
First, Table 4, Table 5, and Table 6 list the literature on fuzzy dependency-based, fuzzy uncertainty measure-based, and fuzzy discernibility matrix-based methods, respectively, where “–” indicates that the method used is not explicitly given.
Based on Table 4, Table 5, and Table 6, some comparative analyses are summarized as follows.
- (1) From the perspective of data sources, the data source in most of the literature is Link (1);
- (2) From the perspective of preprocessing methods, most of the methods given are
Experiments
In this section, in order to test the performance of the above attribute reduction methods, Fuzzy Information Entropy-based (FIE) [89], Fuzzy Complement Entropy-based Feature Selection (FCEFS) [101], Fuzzy Rough-based Feature Selection (FRFS) [41], Fuzzy Discernibility Matrix-based Feature Selection (FDMFS) [41], Fuzzy Discernibility Matrix-based Attribute Reduction (FDMAR) [111], Variable Precision Fuzzy Rough-based Feature Selection (VPRFS) [68], Fuzzy Neighborhood Rough Set-based (FNRS) [73]
Some remarks on new research directions
Attribute reduction is one of the core topics in FRS theory, and many challenging problems remain to be solved and studied further.
- (1) Unsupervised or semi-supervised methods. Most existing attribute reduction methods are supervised. In real life, a large amount of data has no or only partial decision information, and obtaining the decision information of objects is a challenging task. Therefore, when there is no or only partial decision information, how to
Conclusion
This paper reviewed the existing attribute reduction methods based on FRS theory. First, the research and development of FRS theory were summarized. Then, in order to clarify the application of FRS models in attribute reduction, all methods were summarized from six aspects. Furthermore, three types of attribute reduction methods were compared and analyzed through experiments. The experimental results showed that the attribute reduction methods based on FRSs can retain fewer features and improve
CRediT authorship contribution statement
Zhong Yuan: Conceptualization, Writing - original draft, Read and contributed to the manuscript. Hongmei Chen: Supervision, Project administration, Read and contributed to the manuscript. Peng Xie: Visualization, Read and contributed to the manuscript. Pengfei Zhang: Structure fabrication, Read and contributed to the manuscript. Jia Liu: Structure fabrication, Read and contributed to the manuscript. Tianrui Li: Structure fabrication, Read and contributed to the manuscript.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Nos. 61976182, 62076171, 61876157, and 61976245), the Key Techniques of Integrated Operation and Maintenance for Urban Rail Train Dispatching Control System based on Artificial Intelligence (2019YFH0097), and Sichuan Key R&D project (2020YFG0035).
References (152)
- et al.
Learning a coverage set of maximally general fuzzy rules by rough sets
Expert Syst. Appl.
(2000)
- et al.
Dimensionality reduction based on rough set theory: A review
Appl. Soft Comput.
(2009)
- et al.
A new measure of uncertainty based on knowledge granulation for rough sets
Inform. Sci.
(2009)
- et al.
Multi-source information fusion based on rough set theory: A review
Inf. Fusion
(2021)
- et al.
Putting rough sets and fuzzy sets together
- et al.
A comparative study of fuzzy rough sets
Fuzzy Sets and Systems
(2002)
- et al.
An axiomatic characterization of a fuzzy generalization of rough sets
Inform. Sci.
(2004)
On the T-transitivity of kernels
Fuzzy Sets and Systems
(2006)
- et al.
Generalized fuzzy rough sets determined by a triangular norm
Inform. Sci.
(2008)
- et al.
Robust fuzzy rough classifiers
Fuzzy Sets and Systems
(2011)
Learning fuzzy rules from fuzzy samples based on rough set technique
Inform. Sci.
Fuzzy rough sets hybrid scheme for breast cancer detection
Image Vis. Comput.
Rough set based semi-supervised feature selection via ensemble selector
Knowl.-Based Syst.
Unsupervised fuzzy-rough set-based dimensionality reduction
Inform. Sci.
Consistency-based search in feature selection
Artificial Intelligence
Embedded unsupervised feature selection
Feature selection in mixed data: A method using a novel fuzzy rough set-based information entropy
Pattern Recognit.
Approximation of fuzzy concepts in decision making
Fuzzy Sets and Systems
Generalized fuzzy rough sets
Inform. Sci.
Constructive and axiomatic approaches of fuzzy approximation operators
Inform. Sci.
Gaussian kernel based fuzzy rough sets: Model, uncertainty measures and applications
Internat. J. Approx. Reason.
Fuzzy rough sets: application to feature selection
Fuzzy Sets and Systems
Fuzzy–rough attribute reduction with application to web categorization
Fuzzy Sets and Systems
Fuzzy-rough feature selection accelerator
Fuzzy Sets and Systems
A graph approach for fuzzy-rough feature selection
Fuzzy Sets and Systems
Attribute reduction for multi-label learning with fuzzy rough set
Knowl.-Based Syst.
Hybrid data-driven outlier detection based on neighborhood information entropy and its developmental measures
Expert Syst. Appl.
Rough set approach to incomplete information systems
Inform. Sci.
Unknown attribute values in induction
Fuzzy rough set-based attribute reduction using distance measures
Knowl.-Based Syst.
Uncertainty measures for fuzzy relations and their applications
Appl. Soft Comput.
Attribute selection with fuzzy decision reducts
Inform. Sci.
Selecting informative features with fuzzy-rough sets and its application for complex systems monitoring
Pattern Recognit.
Fuzzy-rough data reduction with ant colony optimization
Fuzzy Sets and Systems
On fuzzy-rough sets approach to feature selection
Pattern Recognit. Lett.
On the compact computational domain of fuzzy-rough sets
Pattern Recognit. Lett.
Hybrid attribute reduction based on a novel fuzzy-rough model and information granulation
Pattern Recognit.
Fuzzy rough sets, and a granular neural network for unsupervised feature selection
Neural Netw.
A fuzzy rough set approach for incremental feature selection on hybrid information systems
Fuzzy Sets and Systems
Feature subset selection based on fuzzy neighborhood rough sets
Knowl.-Based Syst.
Fuzzy rough approximations for set-valued data
Inform. Sci.
Feature selection for multi-label learning based on kernelized fuzzy rough sets
Neurocomputing
A fuzzy rough set-based feature selection method using representative instances
Knowl.-Based Syst.
A novel feature selection method using fuzzy rough sets
Comput. Ind.
PARA: A positive-region based attribute reduction accelerator
Inform. Sci.
Incremental feature selection based on fuzzy rough sets
Inform. Sci.
Feature genes selection using fuzzy rough uncertainty metric for tumor diagnosis
Comput. Math. Methods Med.
Probability measures of fuzzy events
J. Math. Anal. Appl.
Information-preserving hybrid data reduction based on fuzzy-rough techniques
Pattern Recognit. Lett.
Uncertainty measures for general fuzzy relations
Fuzzy Sets and Systems