MinReduct: A new algorithm for computing the shortest reducts
Introduction
Reducts from Rough Set Theory (RST) [19], [20], [21], [22] are minimal subsets of attributes that preserve the discernibility capacity of the whole attribute set [22]. The use of reducts for feature selection has been studied in depth [2], [11], [17]. Reducts are useful in multi-objective cost-sensitive attribute reduction [29], [33], attribute relevance evaluation [7], and classification [8], [18], among others.
Computing all reducts of a dataset is NP-hard [28]. Thus, several approximate algorithms for reduct computation have been developed [4], [5]. Among them, some generate a subset of reducts, while others find approximate solutions instead of reducts. There are also several algorithms for computing just a single reduct [9], [10], [34], [37].
Generating the complete set of reducts for a dataset requires a high computational effort and, most of the time, results in a large number of reducts. However, in practice, usually only a subset of reducts satisfying some additional restrictions is needed [10], [31], [35]. A special case is the computation of all the reducts of minimum length (the shortest reducts). This subset is a representative sample of all reducts [30]. The shortest reducts are especially useful, for instance, in data reduction and classification applications [36]. Unfortunately, the problem of computing all the shortest reducts of a decision system is also NP-hard [28].
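To make the notion concrete, the following sketch (illustrative only; the discernibility matrix, attribute names, and function names are hypothetical, not taken from the paper) enumerates attribute subsets by increasing size over a toy discernibility matrix and returns all shortest reducts, which also shows why exhaustive enumeration becomes infeasible as the number of attributes grows:

```python
from itertools import combinations

# Hypothetical discernibility matrix: each set lists the attributes that
# discern one pair of objects. A reduct must intersect every set.
matrix = [{"a1", "a2"}, {"a2", "a3"}, {"a1", "a3"}]
attrs = ["a1", "a2", "a3"]

def covers(subset, matrix):
    """True if `subset` discerns every pair of objects in `matrix`."""
    return all(subset & pair for pair in matrix)

def shortest_reducts(attrs, matrix):
    # Enumerate subsets by increasing size; the first size yielding any
    # cover yields all shortest reducts (minimality holds because no
    # smaller subset covered the matrix).
    for k in range(1, len(attrs) + 1):
        found = [set(c) for c in combinations(attrs, k)
                 if covers(set(c), matrix)]
        if found:
            return found
    return []

# Here no single attribute covers all pairs, so every covering
# 2-subset is a shortest reduct.
print(shortest_reducts(attrs, matrix))
```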
One of the early works on computing all the shortest reducts was presented in [30]. In this work, after a thorough discussion of reduct computation and the role of the shortest reducts within RST applications, two algorithms were presented: the first computes all k-reducts (reducts with length no greater than k) and the second computes all the shortest reducts (SRGA). Both algorithms were built on top of the Modified Reduct Generation Algorithm (MRGA), which is the author's core proposal. MRGA introduces the application of absorption laws over the discernibility function, a representation of the discernibility information; this reduces the runtime in comparison to previous algorithms. However, in this approach, every candidate is evaluated looking for superfluous attributes. This operation has a high computational cost, which degrades the performance of the algorithm in some cases.
In [16], a heuristic approach to reduce the runtime of computing the shortest reducts is presented. This heuristic consists of finding a single short reduct and then using its length to limit the search space by considering only those attribute combinations of no greater length. The main drawback of this algorithm is that the second step searches for reducts with no additional pruning strategies; thus it explores all possible attribute combinations, which is infeasible in most cases.
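The bounding idea of this two-step heuristic can be sketched as follows (a minimal illustration with hypothetical data and function names, not the implementation from [16]): a greedy set-cover pass yields one short reduct, whose length then caps the subsets worth exploring.

```python
# Hypothetical discernibility matrix: each set lists the attributes
# discerning one pair of objects.
matrix = [{"a1", "a2"}, {"a2", "a3"}, {"a1", "a3"}]
attrs = ["a1", "a2", "a3"]

def greedy_short_reduct(attrs, matrix):
    """Greedy set cover: repeatedly pick the attribute that discerns
    the most still-uncovered object pairs."""
    remaining = list(matrix)
    chosen = set()
    while remaining:
        best = max(attrs, key=lambda a: sum(a in p for p in remaining))
        chosen.add(best)
        remaining = [p for p in remaining if best not in p]
    return chosen

bound = len(greedy_short_reduct(attrs, matrix))
# Only attribute combinations of length <= bound need to be examined
# in the subsequent exhaustive search.
print(bound)  # -> 2
```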
The algorithm proposed in [36] (CAMARDF) is an elaborate approach to the problem of computing all the shortest reducts of a dataset. This algorithm operates over the reduced discernibility function, which is a representation of the discernibility information after applying absorption laws. This representation entails relatively inefficient candidate evaluation procedures. Furthermore, CAMARDF sorts the attributes to be processed at each recursion level by their significance. In this way, the attributes that discern more of the object pairs not yet distinguished by the current candidate are added first. This strategy reduces the search space; however, computing the attribute significance has a high computational cost.
In this work, we introduce a new algorithm, called MinReduct, for computing all the shortest reducts. To develop this new algorithm, some of the most effective pruning properties used in state-of-the-art algorithms for computing all reducts [15], [24], [26] are adapted to compute only the shortest reducts. MinReduct uses a fast candidate evaluation procedure over an optimized data representation. Unlike SRGA, MinReduct evaluates candidates using low-cost operations and relies on limiting the candidate size as its main pruning strategy. CAMARDF uses a sorting process for adding new attributes to a candidate, which reduces the number of evaluated candidates compared to the predefined traversal order followed by MinReduct. However, we will show experimentally, over synthetic and real-world datasets, that our proposal of combining low-cost operations with a candidate size-based pruning strategy, which is the main contribution of MinReduct, reduces the runtime in most cases.
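The flavor of such low-cost candidate evaluation can be sketched with bitmasks (an illustrative sketch under hypothetical data; this is not the authors' data representation): each pair of the discernibility matrix maps to one bit, each attribute to a mask of the pairs it discerns, and a candidate is evaluated by cumulative bitwise ORs.

```python
# Hypothetical discernibility matrix: pair i corresponds to bit i.
matrix = [{"a1", "a2"}, {"a2", "a3"}, {"a1", "a3"}]
attrs = ["a1", "a2", "a3"]

full = (1 << len(matrix)) - 1  # all pairs covered
# mask[a] has bit i set iff attribute a discerns pair i.
mask = {a: sum(1 << i for i, p in enumerate(matrix) if a in p)
        for a in attrs}

def covers(candidate):
    """Cumulative OR: one word-level operation per attribute."""
    acc = 0
    for a in candidate:
        acc |= mask[a]
    return acc == full

print(covers({"a1", "a2"}))  # -> True
print(covers({"a1"}))        # -> False
```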
It is known that there is a kind of isomorphism, or dual relationship, between reducts from Rough Set Theory and irreducible testors from Test Theory [12]. In our proposal, we combine ideas and concepts introduced in algorithms designed to compute irreducible testors, such as LEX [27], CT-EXT [25] and BR [14], [15]. It should be noted that the problem we are addressing has also been addressed from graph theory [1].
Basic concepts
In this section, we present some basic concepts from Rough Set Theory (RST) in order to provide the theoretical basis for understanding the proposed algorithm.
Decision systems are a basic representation of information in RST. A decision system (DS) is a table whose rows represent objects and whose columns represent attributes. We denote by U a finite non-empty set of objects U={x1, x2, …, xn}, and A is a finite non-empty set of attributes. Every attribute a ∈ A defines a mapping a: U→Va.
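As a minimal illustration (a hypothetical toy decision system, not an example from the paper; the names are ours), a DS can be held as a list of attribute-value mappings, and the set of attributes discerning two objects computed directly:

```python
# Hypothetical decision system: rows are objects, columns are the
# attributes a1, a2, a3 plus the decision attribute d.
U = [
    {"a1": 0, "a2": 1, "a3": 0, "d": "yes"},
    {"a1": 0, "a2": 0, "a3": 1, "d": "no"},
    {"a1": 1, "a2": 1, "a3": 1, "d": "yes"},
]
A = ["a1", "a2", "a3"]

def discerning_attrs(x, y, attrs):
    """Attributes in `attrs` whose values differ between objects x and y."""
    return {a for a in attrs if x[a] != y[a]}

# Objects 0 and 1 have different decisions; the attributes discerning
# them form one entry of the discernibility information.
print(discerning_attrs(U[0], U[1], A))
```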
Proposed algorithm
In this section, we first provide, in Section 3.1, the theoretical basis needed to introduce the proposed algorithm, which is then presented in Section 3.2.
Evaluation and discussion
In this section, a comparative analysis of the proposed algorithm (MinReduct) versus SRGA [30] and CAMARDF [36] is presented. We selected SRGA and CAMARDF because they are the fastest algorithms in the state of the art. Since all these algorithms have exponential complexity, we perform a comparison through their implementations. For our experiments, we have implemented MinReduct and SRGA in Java, and we used the authors’ implementation of CAMARDF in C.
Evaluations are performed over synthetic basic matrices and real-world datasets from the UCI machine learning repository.
Conclusions
In this work, we introduced a new algorithm, MinReduct, for computing all the shortest reducts of a dataset. Previous algorithms reported in the literature operate over the discernibility function and rely on costly operations for generating and evaluating candidates, while the proposed algorithm uses binary cumulative operations and a fast candidate evaluation process.
From our experiments over synthetic basic matrices and real-world datasets from the UCI machine learning repository, we can conclude that MinReduct reduces the runtime with respect to SRGA and CAMARDF in most cases.
Authorship confirmation
This manuscript, or a large part of it, has not been published, was not, and is not being submitted to any other journal. If presented at or submitted to or published at a conference(s), the conference(s) is (are) identified and substantial justification for re-publication is presented below. A copy of conference paper(s) is(are) uploaded with the manuscript.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (37)
- et al., A rough set approach to feature selection based on ant colony optimization, Pattern Recognit. Lett. (2010)
- et al., Finding rough set reducts with fish swarm algorithm, Knowl. Based Syst. (2015)
- et al., Finding rough and fuzzy-rough set reducts with SAT, Information Sciences (2014)
- et al., A Boolean function approach to feature selection in consistent decision information systems, Expert Syst. Appl. (2011)
- et al., On the relation between rough set reducts and typical testors, Information Sciences (2015)
- et al., An overview of the evolution of the concept of testor, Pattern Recognit. (2001)
- et al., A new algorithm for reduct computation based on gap elimination and attribute contribution, Information Sciences (2018)
- et al., Minimal decision cost reduct in fuzzy decision-theoretic rough set model, Knowl. Based Syst. (2017)
- et al., An enhancement for heuristic attribute reduction algorithm in rough set, Expert Syst. Appl. (2014)
- et al., On the relation between the concepts of irreducible testor and minimal transversal, IEEE Access (2019)
- Feature selection based on rough sets and minimal attribute reduction algorithm, Int. J. Hybrid Inf. Technol.
- Analyzing the impact of the discretization method when comparing Bayesian classifiers
- Attribute importance degrees corresponding to several kinds of attribute reduction in the setting of the classical rough sets, Fuzzy Sets, Rough Sets, Multisets and Clustering
- Minimal attribute reduction with rough set based on compactness discernibility information tree, Soft Computing
- BR: a new method for computing all typical testors
- An algorithm for computing typical testors based on elimination of gaps and reduction of columns, Int. J. Pattern Recognit. Artif. Intell.