Compressed binary discernibility matrix based incremental attribute reduction algorithm for group dynamic data
Introduction
Rough set theory [1] is a powerful mathematical tool proposed by Pawlak to deal with imprecise or vague concepts. The application of traditional rough set theory is restricted by its strict reliance on the equivalence relation. Combined with fuzzy sets, probability theory, and other soft computing methods, rough set theory has been widely applied in machine learning, decision analysis, knowledge discovery, and related fields. Attribute reduction is one of the focal points of rough set theory. Its core idea is to obtain a subset of attributes that preserves the same discriminability as the whole attribute set. However, datasets in real-world applications often vary dynamically over time, and in many cases they expand by groups of objects rather than one object at a time. A non-incremental approach to attribute reduction is inefficient because it must treat the expanded dataset as a new one and recompute the reduction from scratch, which usually consumes large amounts of computational time and storage space. Traditional incremental approaches, which consider a single dynamic object in each iteration, may not be applicable to datasets with group dynamic objects. It is therefore necessary to develop an effective incremental attribute reduction algorithm for group dynamic data.
An information system is mainly composed of objects, attributes, and attribute values. Accordingly, the current literature on incremental attribute reduction considers three cases: variation of the attribute set [2], [3], variation of the attribute values [4], [5], and variation of the object set [6], [7]. Dynamic variation of the attribute set is mainly caused by adding or deleting attributes [8]. Variation of attribute values arises in two situations: either an original data value is wrong and needs to be corrected, or part of the attribute values in the information system needs to be updated with new ones [9]. Dynamic change of the object set is caused by adding or deleting objects [10]. In actual industrial production with online data acquisition and processing, the attributes and their value domains are set in advance, so it is the dynamically expanding object set that deserves the most attention.
For information systems with dynamic objects, incremental attribute reduction approaches can be divided, according to the measure of attribute significance they use, into approaches based on the positive region [11], [12], information entropy [13], and the discernibility matrix [7], [14]. Compared with the other approaches, the discernibility matrix is more intuitive and easier to combine with other intelligent methods. Attribute reduction methods based on the discernibility matrix have therefore attracted increasing attention. Lazo-Cortés et al. proposed an improved attribute reduction algorithm based on the binary discernibility matrix [15], which obtains the attribute reduction by operating on the binary discernibility matrix directly. However, it can only be used in a static information system. Yang introduced a method for updating the attribute core incrementally [16]; on the basis of the obtained core attributes, an incremental attribute reduction algorithm based on the discernibility matrix was then proposed. The limitation of this algorithm is that the whole discernibility matrix needs to be rebuilt in each iteration, which leads to low efficiency in practice. An efficient incremental attribute reduction algorithm that does not need to store the matrix was presented in [14]. Given the known core attributes, this method obtains the attribute reduction using heuristic information from the positive region; however, the authors did not show how to reduce the required storage space, and the attribute reduction cannot be obtained rapidly when a group of objects is added together. Xu et al. proposed a binary discernibility matrix based method that obtains the attribute reduction whether the newly added data is a single object or multiple objects [17]. However, they use the binary discernibility matrix only to establish the programming equation and then obtain the reduction by 0–1 programming, rather than by operating on the constructed binary discernibility matrix. An incremental updating approach that computes core attributes dynamically based on an improved binary discernibility matrix was presented in [18]. However, it also needs a large memory space to store the whole binary discernibility matrix.
To the best of our knowledge, the existing algorithms cannot both effectively reduce the storage space of the discernibility matrix and acquire the attribute reduction rapidly for group dynamic data. To reduce the storage space consumed by the binary discernibility matrix, a compressed binary discernibility matrix is introduced. On this basis, an incremental attribute reduction algorithm for group dynamic data is developed. Theoretical analysis, a worked example, and experimental simulation verify the validity of the algorithm. This paper makes two contributions: (1) a compressed binary discernibility matrix is proposed to reduce the storage space effectively; (2) an incremental attribute reduction algorithm for group dynamic data based on the compressed binary discernibility matrix is developed, which accelerates processing when a group of dynamic objects is added.
The paper is organized as follows. Section 2 outlines some preliminary knowledge. In Section 3, we present an approach to compress the binary discernibility matrix and an incremental approach to compute core attributes for group data based on the binary discernibility matrix. In Section 4, we propose an incremental attribute reduction algorithm for group-added data. Finally, in Section 5, we present the conclusion and future work.
Preliminaries
Information and knowledge in rough set theory can be represented in an information system. An information system can be characterized as a decision table, where the information is stored in a table. Each row in the table represents an individual record. Each column represents an attribute of the records or a field.
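To make the row-and-column view concrete, the following is a minimal sketch of a decision table in Python. The class name `DecisionTable`, the attribute names `a`, `b`, `c`, `d`, and the sample values are all hypothetical and chosen only for illustration; they do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionTable:
    """A decision table: each row is an object, each column an attribute."""
    condition_attrs: tuple  # names of the condition attributes (hypothetical)
    decision_attr: str      # name of the decision attribute (hypothetical)
    rows: tuple             # one dict per object: attribute name -> value

    def value(self, obj_index, attr):
        """Return the value of attribute `attr` for the object at `obj_index`."""
        return self.rows[obj_index][attr]

    def indiscernible(self, i, j, attrs):
        """True if objects i and j agree on every attribute in `attrs`."""
        return all(self.rows[i][a] == self.rows[j][a] for a in attrs)


# Illustrative data only: three objects, three condition attributes, one decision.
table = DecisionTable(
    condition_attrs=("a", "b", "c"),
    decision_attr="d",
    rows=(
        {"a": 0, "b": 1, "c": 0, "d": "yes"},
        {"a": 0, "b": 1, "c": 1, "d": "no"},
        {"a": 1, "b": 0, "c": 1, "d": "no"},
    ),
)
```

In this sample, the first two objects cannot be told apart by `a` and `b` alone, but attribute `c` separates them, which is the kind of distinction that attribute reduction has to preserve.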
Definition 1 An information system S is a 4-tuple, defined as follows:
U is a non-empty finite set of objects, called the universe. , where is the set of
Incremental attribute core computing
Generally speaking, data sets often contain repetitive objects. Involving these redundant objects in the computation repeatedly is unnecessary and time-consuming. To avoid rebuilding elements of the binary discernibility matrix because of redundant objects, Definition 5 is redefined as follows.
Definition 6 For a given information system S, the binary discernibility matrix is redefined by according to pairs of objects, where exampair is the pair of objects xi and
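The general idea of this section can be sketched in Python under stated assumptions. The sketch uses the common textbook form of the binary discernibility matrix (one 0/1 row per pair of objects with different decisions, with a 1 wherever the pair differs on a condition attribute), which may differ in detail from Definitions 5 and 6 of the paper. The function names and the sample data are hypothetical.

```python
def drop_repetitive(objects):
    """Keep the first occurrence of each distinct (conditions, decision) object."""
    seen, unique = set(), []
    for obj in objects:
        if obj not in seen:
            seen.add(obj)
            unique.append(obj)
    return unique


def binary_discernibility_matrix(objects):
    """Build one 0/1 row per pair of objects whose decisions differ.

    `objects` is a list of (condition_values, decision) tuples. Entry k of a
    row is 1 when the two objects differ on condition attribute k.
    """
    rows = []
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            (ci, di), (cj, dj) = objects[i], objects[j]
            if di == dj:
                continue  # same decision: the pair need not be discerned
            row = tuple(int(u != v) for u, v in zip(ci, cj))
            if any(row):  # assumption: skip inconsistent pairs (all-zero rows)
                rows.append(row)
    return rows


def core_attributes(matrix):
    """Indices of attributes that are the only 1 in at least one row."""
    return sorted({row.index(1) for row in matrix if sum(row) == 1})


# Illustrative data only; the last object repeats the first one.
objects = [
    ((0, 1, 0), "yes"),
    ((0, 1, 1), "no"),
    ((1, 0, 1), "no"),
    ((0, 1, 0), "yes"),
]
unique = drop_repetitive(objects)
matrix = binary_discernibility_matrix(unique)
core = core_attributes(matrix)
```

Removing the repeated object before building the matrix avoids generating rows that would only duplicate existing ones. An attribute whose column holds the single 1 of some row is indispensable, because no other attribute can discern that pair.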
Incremental attribute reduction for group dynamic data
Attribute reduction not only eliminates irrelevant and redundant attributes but also preserves the important attributes that play essential roles in the classification problem. Studying incremental attribute reduction algorithms for group dynamic data is therefore valuable, because such algorithms can significantly reduce time consumption. Proposition 1 [19] Let S be a decision table, and . If x1 and y1 are repetitive objects in U1, x2 and y2 are repetitive in U2, x3 and y3 are inconsistent in U2,
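As a rough illustration of group-incremental updating, the Python sketch below extends an existing set of binary discernibility rows when a group of objects arrives, comparing only pairs that involve a new object. It is not the algorithm of this paper: the handling of repetitive and inconsistent objects, the storage of rows as a set, and the greedy selection rule are all simplifying assumptions, and every function name is hypothetical.

```python
def pair_row(x, y):
    """0/1 row for a pair of (conditions, decision) objects, or None if no row is needed."""
    (cx, dx), (cy, dy) = x, y
    if dx == dy:
        return None  # same decision: nothing to discern
    row = tuple(int(u != v) for u, v in zip(cx, cy))
    return row if any(row) else None  # assumption: ignore inconsistent pairs


def extend_matrix(old_objects, old_rows, new_group):
    """Add a group of objects, comparing only pairs that involve a new object."""
    objects = list(old_objects)
    known = set(objects)
    rows = set(old_rows)  # a set stores each distinct row once
    for x in new_group:
        if x in known:
            continue  # repetitive object: contributes no new row
        for y in objects:
            row = pair_row(x, y)
            if row is not None:
                rows.add(row)
        objects.append(x)
        known.add(x)
    return objects, rows


def greedy_reduct(rows, n_attrs):
    """Heuristic cover: repeatedly pick the attribute that discerns the most uncovered rows."""
    remaining = set(rows)
    chosen = []
    while remaining:
        best = max(range(n_attrs), key=lambda k: sum(r[k] for r in remaining))
        chosen.append(best)
        remaining = {r for r in remaining if r[best] == 0}
    return sorted(chosen)


# Illustrative data only: two existing objects, then a group of three arrives.
old_objects = [((0, 1, 0), "yes"), ((0, 1, 1), "no")]
old_rows = {(0, 0, 1)}
new_group = [((1, 0, 1), "no"), ((0, 1, 0), "yes"), ((1, 1, 0), "no")]

objects, rows = extend_matrix(old_objects, old_rows, new_group)
reduct = greedy_reduct(rows, n_attrs=3)
```

Pairs among the old objects are never recomputed, which is where the saving over a full rebuild comes from. The greedy step returns an attribute set that discerns every stored pair, but it is only a heuristic and may include an attribute that a more careful reduction would remove.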
Experimental analysis
In this section, we conduct numerical experiments to assess the efficiency of the proposed algorithms. The experiments are implemented in MATLAB 2010A on a 2.2 GHz AMD processor with 4 GB of RAM, running the Windows 7 operating system. For convenience, the incremental attribute reduction algorithm for group dynamic data is denoted by IARA_GD. To illustrate the efficiency of our algorithm, two different discernibility matrix based algorithms are used for comparison. One is the incremental
Conclusion
Incremental attribute reduction is very important in dynamic data analysis with rough set theory. Traditional incremental algorithms consider only a single dynamic object in each iteration. When a group of dynamic objects arrives at the same time, which often occurs in real-life applications, this kind of algorithm is not efficient enough. To update the attribute reduction results efficiently for group dynamic data, an incremental attribute reduction algorithm based on compressed binary
Acknowledgments
This research was supported by the Key Research and Development Program of China (2017YFD0401001), the National Natural Science Foundation of China (61833011, 61403184), the Major Program of the Natural Science Foundation of Jiangsu Province Education Commission (17KJA120001), “Qing Lan” Project of Jiangsu Province (QL2016), and “1311 Talent Plan” of Nanjing University of Posts and Telecommunications (NY2018).
References (20)
- et al. An incremental attribute reduction approach based on knowledge granularity under the attribute generalization. Int. J. Approx. Reason. (2016)
- et al. Updating attribute reduction in incomplete decision systems with the variation of attribute set. Int. J. Approx. Reason. (2014)
- et al. Incremental feature selection based on rough set in dynamic incomplete data. Pattern Recognit. (2014)
- et al. Incremental update of approximations in dominance-based rough sets approach under the variation of attribute values. Inf. Sci. (2015)
- et al. Incremental updating approximations in probabilistic rough sets under the variation of attributes. Knowl. Based Syst. (2015)
- et al. Attribute reduction for dynamic data sets. Appl. Soft Comput. (2013)
- et al. An incremental approach for attribute reduction based on knowledge granularity. Knowl. Based Syst. (2016)
- et al. A dynamic attribute reduction algorithm based on 0–1 integer programming. Knowl. Based Syst. (2011)
- Rough sets. Int. J. Comput. Inf. Sci. (1982)
- et al. An incremental algorithm for attribute reduction based on labeled discernibility matrix. Acta Autom. Sin. (2014)
Fumin Ma received her bachelor's degree from Henan University, China, in 2002, her master's degree from the Graduate University of Chinese Academy of Sciences in 2005, and her Ph.D. degree from Tongji University, China, in 2008. She was a visiting researcher at University College Dublin (UCD) from 2014 to 2015. She is currently a Professor and supervisor for master's students at Nanjing University of Finance and Economics. Her research interests include intelligent information processing and process industry modeling and simulation.
Mianwei Ding received his bachelor's degree from Jincheng College of Nanjing University of Aeronautics and Astronautics, China, in 2014, and his master's degree from Nanjing University of Posts and Telecommunications in 2017. His research interests include rough sets and neural computing.
Tengfei Zhang received his bachelor's degree from Henan University, China, in 2002, and his master's and Ph.D. degrees from Shanghai Maritime University, China, in 2004 and 2007, respectively. He is currently a Professor and supervisor for master's students at Nanjing University of Posts and Telecommunications. His research interests include intelligent information processing and complex system modeling and control.