Stepwise optimal scale selection for multi-scale decision tables via attribute significance
Introduction
Granular computing (GrC), which derives from the idea of fuzzy information granulation first proposed by Zadeh in 1979 [57], [58], serves as a powerful tool for complex problem solving, massive data mining and fuzzy information processing. The past several decades have witnessed the rapid development of GrC [1], [2], [16], [19], [23], [24], [27], [33], [34], [43], [45], [47], [48], [49], [51], [55], [59], [60]. As a primitive notion, a granule is a clump of objects drawn together by the criteria of indistinguishability, similarity or functionality [58]. The elements within a granule, satisfying a given specification, are treated as a whole rather than as individuals. Therefore, with respect to a particular level of granularity, a universe can be represented by a set of granules. This process is called information granulation, and it provides an effective approach to solving a complex problem at a certain level of granulation. The partition model proposed by Yao [52], serving as a significant and widely used model for GrC, is constructed by granulating a finite universe of discourse into a family of pairwise disjoint subsets under an equivalence relation. Furthermore, Bittner and Stell [3], Yao [49], and Wu and Leung [41] have studied multiple granulation hierarchies. Recently, Xu et al. [46] and Hu et al. [13] studied information fusion and machine learning, respectively, from the viewpoint of GrC.
Rough set theory (RST), originally proposed by Pawlak [28], has played a vital role in the extension and development of GrC. As a powerful tool of soft computing, it performs well in the construction, interpretation and representation of granules in a universe under an equivalence relation, and provides more precise concepts for defining and analyzing notions of GrC. From the viewpoint of GrC, equivalence granules can be obtained in a Pawlak approximation space based on an equivalence relation, and they are the basic components for representation and approximation in that space.
Some extensions of RST concerning knowledge acquisition from information tables via an objective knowledge-induction process have been successively proposed, such as probabilistic rough sets [38], [39], [40], [50], [53], [54], dominance-based rough sets [4], [6], [7], [20], [36], and multigranulation rough sets [10], [11], [12], [14], [21], [25], [26], [30], [31], [32], [56]. In this literature, an information table in which each object takes exactly one value under each attribute is called a single-scale information table (SSIT). In an SSIT, an equivalence relation is determined by a subset of attributes and granulates the universe of discourse into equivalence granules. The inclusion relation between subsets of attributes implies a coarser-or-finer relation between granules, which induces a multi-layered granulation structure on the universe. However, objects are usually measured at different scales under the same attribute [17]: the same object may take different values under the same attribute when measured at different scales, which induces a hierarchical structure for knowledge acquisition. To deal with this problem, Wu and Leung [41] proposed a special information table called a multi-scale information table (MSIT), in which data are represented at different scales at different levels of granulation, with a granular information transformation from a finer to a coarser labelled partition [42].
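In an SSIT, a chosen subset of attributes induces an indiscernibility relation whose classes are exactly the equivalence granules described above. A minimal sketch of this granulation step (the table contents and attribute names are hypothetical, chosen only for illustration):

```python
from collections import defaultdict

def equivalence_classes(table, attrs):
    """Group objects by their value tuples on attrs, i.e. the classes of IND(attrs)."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())

# Toy single-scale table: object -> {attribute: value} (values are made up).
table = {
    "x1": {"a": 1, "b": 0},
    "x2": {"a": 1, "b": 0},
    "x3": {"a": 2, "b": 0},
}

# On {a, b}, x1 and x2 fall into one granule and x3 into another;
# dropping attribute a merges all three objects into a single granule,
# illustrating how a smaller attribute subset yields coarser granules.
print(equivalence_classes(table, ["a", "b"]))
print(equivalence_classes(table, ["b"]))
```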
A simple example: the point 40°44′54.74″N, 73°59′10.88″W can be located in the Northern or Western Hemisphere at the coarsest granule, in the United States at a coarser granule, in New York at a finer granule, or in Manhattan at the finest granule; in fact, it is the approximate geographic coordinate of the Empire State Building. For a given subset of attributes, two different scale combinations may induce granules such that one family is a refinement or a coarsening of the other. Hierarchically structured data usually contain much useful information about the objects of interest, but they also contain redundancy. Therefore, the key idea in processing an MSIT is to constitute subsystems, i.e., to extract SSITs by restricting each attribute to one of its own scales, and then to carry out attribute reduction and knowledge acquisition in a proper SSIT selected by some rule. Thus selecting a proper decision table decomposed from an MSIT is an important issue. In fact, finer granules mean more cost, while coarser granules may fail to capture some useful information, so an appropriate level of granulation should be selected to approximate subsets of the universe of discourse. At one extreme, if a decision table is obtained with all attributes at their finest scales, the equivalence granules are the finest and capture the most information for knowledge acquisition. However, this may cause great economic loss, because improving the accuracy of measurement costs more, and the same classification effect can often be achieved with some attributes not at their finest scales. At the other extreme, if a decision table is obtained with all attributes at their coarsest scales, the equivalence granules are the coarsest and perform badly in decision making. Thus optimal scale selection is a critical issue in processing MSITs.
If S is an MSIT, the basic processing of S is as follows.
- Based on the levels of scales of its attributes, S is decomposed into many SSITs, each attribute being restricted to one of its scales.
- According to given rules, an appropriate decision table is selected from these SSITs for decision making.
- In the selected subsystem, attribute reduction and rule extraction are performed.
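The first step above, extracting an SSIT from an MSIT, can be sketched as follows, assuming a representation in which each attribute stores its values as a tuple ordered from the finest scale to the coarsest (this representation and all names are illustrative, not the paper's notation):

```python
def extract_ssit(msit, scale_combo):
    """Restrict every attribute to its chosen scale level (0 = finest)."""
    return {
        obj: {a: levels[scale_combo[a]] for a, levels in row.items()}
        for obj, row in msit.items()
    }

# Toy multi-scale table: each attribute maps to a tuple of values,
# ordered from the finest scale to the coarsest (values are made up).
msit = {
    "x1": {"a": (1.2, 1, "low"), "b": (5, "S")},
    "x2": {"a": (1.7, 2, "low"), "b": (9, "L")},
}

# Take attribute a at its coarsest scale and b at its finest:
# the result is an ordinary single-scale table.
print(extract_ssit(msit, {"a": 2, "b": 0}))
```

Each scale combination yields one such SSIT, so an MSIT with many attributes and levels induces a whole family of candidate decision tables, which is why a selection rule is needed.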
In an MSIT, for any attribute, one may regard a refined attribute value as a revision of a coarser attribute value; an MSIT can then be treated as an information system that is dynamically updated over time, and incremental learning algorithms can be used for attribute reduction [15], [37]. However, in this case the original static hierarchically structured information in the MSIT is replaced by dynamically accumulating information; in other words, the hierarchical information is lost, so this case is not the focus of the paper. In this paper we consider only the static approach to scale selection from the viewpoint of hierarchical structure; the dynamic case will be considered in future work.
Wu and Leung assumed that all attributes have the same number of levels of scales and studied optimal scale selection for such special multi-scale decision tables (MSDTs) [41], [44]. Under the same assumption, Gu et al. [8], [9] and She et al. [35] studied knowledge acquisition and rule induction in MSDTs. Furthermore, Li and Hu extended the theory and application of MSDTs to attributes with different numbers of levels of scales [18]. The Wu–Leung model [42] and the complement and lattice models proposed in [18] all perform well in solving the optimal scale selection problem. Specifically, the Wu–Leung model [42] mainly studies optimal scale selection for MSDTs under the previous assumption from the viewpoint of the standard rough set model and a dual probabilistic rough set model, while the complement and lattice models [18] study the general case; the differences between them have been discussed in [18]. In this paper we always study the general case, following [18]. In particular, the lattice model is able to find all optimal scale combinations among all combinations, but it is time-consuming (e.g., see Table 7), so a faster approach is urgently needed. Besides, another problem arises: how to define the concept of multi-scale attribute significance for an MSDT, since attribute significance is an important property in processing decision tables. Motivated by these issues, in this paper we extend attribute significance in SSDTs to multi-scale attribute significance in MSDTs, and give two further equivalent definitions in the sense of binary classification. Furthermore, based on the notion of multi-scale attribute significance, we propose a novel stepwise optimal scale selection approach that obtains one optimal scale combination in an MSDT. Compared with the lattice model in [18], the new approach has a great advantage in time cost (e.g., see Table 7).
Finally, real-life experiments are employed to illustrate its feasibility and efficiency.
The remainder of the paper is organized as follows. In Section 2, basic notions of Pawlak's rough set and information systems are reviewed. In Section 3, some concepts of multi-scale information tables and multi-scale attribute significance are introduced. In Section 4, the concept of stepwise optimal scale selection for consistent and inconsistent MSDTs and two further equivalent definitions of multi-scale attribute significance are proposed. Five algorithms for computing stepwise optimal scale selection are given in Section 5, and six real-life experiments are employed to test these algorithms in Section 6. Finally, we conclude the paper with a summary and an outlook on further work in Section 7.
Section snippets
Preliminaries
In this section, we review several basic concepts and introduce some notions about Pawlak’s rough set and information table.
Related to multi-scale information system
In this section, we review some concepts of multi-scale information tables. Furthermore, we first propose the notion of multi-scale attribute significance.
Stepwise optimal scale selection in multi-scale decision table
In this section, we briefly recall the lattice model for computing all optimal scale combinations in an MSDT. Then we propose stepwise optimal scale selection for faster computation of one optimal scale combination in consistent and inconsistent multi-scale decision tables.
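The stepwise idea can be illustrated by a deliberately simplified greedy sketch: starting from the finest scales, coarsen each attribute in turn as far as the extracted table stays consistent with the decision. This is a plausible approximation under stated assumptions, not the authors' algorithm; in the paper the processing order is driven by multi-scale attribute significance, which here is simply taken as a given `order` argument, and all table values are made up:

```python
def extract(msit, combo):
    """Single-scale view of the table: each object becomes a value tuple."""
    return {o: tuple(row[a][combo[a]] for a in combo) for o, row in msit.items()}

def is_consistent(view, decisions):
    """Consistent iff objects with equal value tuples have equal decisions."""
    seen = {}
    for o, key in view.items():
        if seen.setdefault(key, decisions[o]) != decisions[o]:
            return False
    return True

def stepwise_scale_selection(msit, decisions, order):
    """Greedily coarsen each attribute (in the given order) as far as
    consistency with the decision allows; level 0 denotes the finest scale."""
    combo = {a: 0 for a in order}
    for a in order:
        n_levels = len(next(iter(msit.values()))[a])
        for k in range(n_levels - 1, 0, -1):      # try the coarsest level first
            trial = dict(combo)
            trial[a] = k
            if is_consistent(extract(msit, trial), decisions):
                combo[a] = k
                break
    return combo

# Made-up table: coarsening a would merge x1 and x2, which carry
# different decisions, so a must stay fine while b can be coarsened.
msit = {
    "x1": {"a": (1, "L"), "b": (3, "S")},
    "x2": {"a": (2, "L"), "b": (3, "S")},
}
decisions = {"x1": 0, "x2": 1}

print(stepwise_scale_selection(msit, decisions, ["a", "b"]))
```

Because consistency can only degrade as granules merge, each attribute's coarsest admissible level is found by this descending scan, which is the source of the speed advantage over enumerating all combinations.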
Algorithms for stepwise optimal scale selection
In this section, we propose algorithms to compute the stepwise optimal scale combination based on attribute significance in consistent and inconsistent decision tables.
First, Algorithm 1 computes the positive region for a given single-scale decision table. Similar to Algorithm 1 in [18], its complexity is O(|U|²). Algorithm 2 is designed to compute the ascending sort of attribute significances in a given multi-scale decision table.
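The positive region computed by Algorithm 1 is the union of all condition classes that are pure with respect to the decision. A minimal hash-based sketch (the table and attribute names are made up; note that a pairwise-comparison formulation of the same computation is what yields an O(|U|²) bound, while grouping by value tuples as below is roughly linear in |U|):

```python
from collections import defaultdict

def positive_region(table, cond_attrs, dec):
    """POS(d): union of all condition classes that are pure w.r.t. the decision."""
    classes = defaultdict(list)
    for obj, row in table.items():
        classes[tuple(row[a] for a in cond_attrs)].append(obj)
    pos = set()
    for members in classes.values():
        if len({table[o][dec] for o in members}) == 1:   # a pure class
            pos.update(members)
    return pos

# Toy decision table (values are made up): x1 and x2 collide on a
# but disagree on d, so only x3 belongs to the positive region.
sdt = {
    "x1": {"a": 1, "d": "y"},
    "x2": {"a": 1, "d": "n"},
    "x3": {"a": 2, "d": "y"},
}
print(positive_region(sdt, ["a"], "d"))
```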
Next, we analyze the
Case study
To describe the mechanism of stepwise optimal scale selection more clearly, Example 6.1 illustrates the detailed processing of the proposed algorithm.
Example 6.1 Find one optimal scale combination for the multi-scale decision table presented in Table 6 by Algorithm 5. From the table, we know that this multi-scale decision table is inconsistent, since x4 and x6 are indistinguishable w.r.t. RC but d(x4) ≠ d(x6).
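Table 6 itself is not reproduced in this excerpt, so the following sketch uses made-up values that merely mimic the stated conflict: two objects indiscernible on the condition attributes but differing on the decision witness the inconsistency.

```python
from itertools import combinations

def inconsistent_pairs(table, cond_attrs, dec):
    """Object pairs indiscernible on cond_attrs yet differing on the decision."""
    return [
        (p, q)
        for p, q in combinations(table, 2)
        if all(table[p][a] == table[q][a] for a in cond_attrs)
        and table[p][dec] != table[q][dec]
    ]

# Hypothetical stand-in for Table 6: x4 and x6 agree on the condition
# attribute a but carry different decisions, so the table is inconsistent.
tbl = {
    "x4": {"a": 1, "d": "y"},
    "x5": {"a": 2, "d": "y"},
    "x6": {"a": 1, "d": "n"},
}
print(inconsistent_pairs(tbl, ["a"], "d"))
```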
Compute the ascending sort τ of attribute significances. In
Conclusions
A multi-scale information system serves as an extension of an information system, in which varying values can be taken under the same attribute for each object at different levels of scales. Based on the concept of attribute significance in a single-scale decision table, multi-scale attribute significance in a multi-scale decision table is proposed. To some extent, another two equivalent definitions based on the lower approximation distribution and upper approximation distribution in inconsistent multi-scale
Acknowledgements
The authors thank the editors and anonymous reviewers for their most valuable comments and suggestions in improving this paper. This research was supported by the National Natural Science Foundation of China (Grant Nos. 11571010, 61179038).
References (60)
- Approximate distribution reducts in inconsistent interval-valued ordered decision tables, Inf. Sci. (2014)
- Rough sets theory for multicriteria decision analysis, Eur. J. Oper. Res. (2001)
- Three-way decisions space and three-way decisions, Inf. Sci. (2014)
- Three-way decision spaces based on partially ordered sets and three-way decisions based on hesitant fuzzy sets, Knowl. Based Syst. (2016)
- The aggregation of multiple three-way decision space, Knowl. Based Syst. (2016)
- Granular computing based machine learning in the era of big data, Inf. Sci. (2017)
- Intuitionistic fuzzy multigranulation rough sets, Inf. Sci. (2014)
- Matrix-based dynamic updating rough fuzzy approximations for data mining, Knowl. Based Syst. (2017)
- A new approach of optimal scale selection to multi-scale decision tables, Inf. Sci. (2017)
- Concept learning via granular computing: a cognitive viewpoint, Inf. Sci. (2015)
- Incremental update of approximations in dominance-based rough sets approach under the variation of attribute values, Inf. Sci.
- UCI machine learning repository
- Data Mining, Rough Sets and Granular Computing
- Rough sets, Int. J. Comput. Inf. Sci.
- Systematic mapping study on granular computing, Knowl. Based Syst.
- A local approach to rule induction in multi-scale decision tables, Knowl. Based Syst.
- Efficient updating rough approximations with multi-dimensional variation of ordered data, Inf. Sci.
- Probabilistic rough sets characterized by fuzzy sets
- Comparison of the probabilistic approximate classification and the fuzzy set model, Fuzzy Sets Syst.
- Upper and lower probabilities of fuzzy events induced by a fuzzy set-valued mapping
- Theory and applications of granular labelled partitions in multi-scale decision tables, Inf. Sci.
- Granular computing and knowledge reduction in formal contexts, IEEE Trans. Knowl. Data Eng.
- A novel cognitive system model and approach to transformation of information granules, Int. J. Approximate Reasoning
- A novel approach to information fusion in multi-source datasets: a granular computing viewpoint, Inf. Sci.
- Rough sets, neighborhood systems and granular computing, Proceedings of IEEE Canadian Conference on Electrical and Computer Engineering
- Stratified rough sets and granular computing
- Probabilistic approaches to rough sets, Expert Syst.
- Quantitative information architecture, granular computing and rough set models in the double-quantitative approximation space of precision and grade, Inf. Sci.
- Combining granular computing and RBF neural network for process planning of part features, Int. J. Manuf. Technol.
Cited by (76)
- Information fusion for multi-scale data: Survey and challenges, Information Fusion (2023)
- Optimal scale selection and knowledge discovery in generalized multi-scale decision tables, International Journal of Approximate Reasoning (2023)
- Optimal scale generation in two-class dominance decision tables with sequential three-way decision, Information Sciences (2023)
- Optimal scale selection based on multi-scale single-valued neutrosophic decision-theoretic rough set with cost-sensitivity, International Journal of Approximate Reasoning (2023)
- A prospect-regret theory-based three-way decision model with intuitionistic fuzzy numbers under incomplete multi-scale decision information systems, Expert Systems with Applications (2023)