Elsevier

Neurocomputing

Volume 344, 7 June 2019, Pages 20-27
Compressed binary discernibility matrix based incremental attribute reduction algorithm for group dynamic data

https://doi.org/10.1016/j.neucom.2018.01.094

Abstract

Datasets in real-world applications often vary dynamically over time. Moreover, in many cases datasets expand by a group of objects at a time rather than one object at a time. Traditional incremental attribute reduction approaches, which handle a single dynamic object, may not be applicable to such cases. To address this issue, a compressed binary discernibility matrix is introduced and an incremental attribute reduction algorithm for group dynamic data is developed. Both a single dynamic object and a group of dynamic objects are considered in this algorithm. Depending on whether the dynamic data is a single object or a group of objects, a different branch is chosen to update the compressed binary discernibility matrix. Thereafter, the incremental reduction result is obtained from the updated compressed binary discernibility matrix. The validity of this algorithm is demonstrated by simulation and experimental analysis.

Introduction

Rough set theory [1] is a powerful mathematical tool proposed by Pawlak to deal with imprecise or vague concepts. The application of traditional rough set theory is restricted by the strict requirements of the equivalence relation. Through combination with fuzzy sets, probability theory, and other soft computing methods, rough set theory is widely applied in machine learning, decision analysis, knowledge discovery and so on. Attribute reduction is one of the focal points in rough set theory. The core idea of attribute reduction is to obtain a subset of attributes that keeps the same discriminability as the whole attribute set. However, datasets in real-world applications often vary dynamically over time. Moreover, in many cases datasets expand by groups of data. The non-incremental approach to attribute reduction is inefficient because it has to treat the expanded dataset as a new one, and it usually consumes large amounts of computational time and storage space for recomputation. Traditional incremental approaches, which consider a single dynamic object in each iteration, may not be applicable to datasets with group dynamic objects. It is therefore necessary to develop an effective incremental attribute reduction algorithm for group dynamic data.

An information system is mainly composed of objects, attributes and attribute values. Accordingly, three cases are considered by incremental attribute reduction algorithms in the current literature: variation of the attribute set [2], [3], of the attribute values [4], [5] and of the object set [6], [7]. Dynamic variation of the attribute set is mainly caused by adding or deleting attributes [8]. Variation of attribute values arises mainly in two situations: either an original data value is wrong and needs to be corrected, or part of the attribute values in the information system needs to be updated with new ones [9]. Dynamic change of the object set is caused by adding or deleting objects [10]. In online data acquisition and processing in actual industrial production, the attributes and their values are set in advance, so it is the dynamically expanding object set that deserves the most attention.

For information systems with dynamic objects, incremental attribute reduction approaches can be divided, according to the attribute significance measure they use, into approaches based on the positive region [11], [12], information entropy [13] and the discernibility matrix [7], [14]. Compared with the other approaches, the discernibility matrix is more intuitive and easier to combine with other intelligent methods. Therefore, attribute reduction methods based on the discernibility matrix have attracted increasing attention. Lazo-Cortés et al. proposed an improved attribute reduction algorithm based on the binary discernibility matrix [15], which acquires the attribute reduction by operating on the binary discernibility matrix directly. However, it can only be used in a static information system. Yang introduced a method for updating the attribute core incrementally [16]; thereafter, on the basis of the obtained core attributes, an incremental attribute reduction algorithm based on the discernibility matrix was proposed. The limitation of this algorithm is that the whole discernibility matrix needs to be rebuilt in each iteration, which leads to low efficiency in practice. An efficient incremental attribute reduction algorithm that does not need to store the matrix was presented in [14]. With the core attributes known, the attribute reduction can be further obtained by this method using heuristic information from the positive region; however, the authors did not show how to reduce the required storage space. Moreover, the attribute reduction cannot be obtained rapidly when a group of objects is added together. Xu et al. proposed a binary discernibility matrix based method to obtain the attribute reduction whether the newly added data is a single object or multiple objects [17]. However, they use the binary discernibility matrix only to establish the programming equation and then obtain the reduction by 0–1 programming, rather than by operating on the built binary discernibility matrix. An incremental updating approach that computes core attributes dynamically based on an improved binary discernibility matrix was presented in [18]. However, it also needs a large memory space to store the whole binary discernibility matrix.

To the best of our knowledge, the existing algorithms cannot effectively reduce the storage space of the discernibility matrix while also acquiring the attribute reduction rapidly for group dynamic data. To reduce the storage space consumed by the binary discernibility matrix, a compressed binary discernibility matrix is introduced. Thereafter, an incremental attribute reduction algorithm for group dynamic data based on the compressed binary discernibility matrix is developed. Theoretical analysis, a worked example and experimental simulation verify the validity of the algorithm. This paper makes two contributions: (1) a compressed binary discernibility matrix is proposed to reduce the storage space effectively; (2) an incremental attribute reduction algorithm for group dynamic data based on the compressed binary discernibility matrix is developed, which accelerates processing when a group of dynamic objects is added.

The paper is organized as follows. Section 2 outlines some preliminary knowledge. In Section 3, we present an approach to compress the binary discernibility matrix and an incremental approach to obtain core attributes for group data based on the binary discernibility matrix. Next, in Section 4, we present an incremental attribute reduction algorithm for group added data. Finally, in Section 5, we present the conclusion and future work.

Section snippets

Preliminaries

Information and knowledge in rough set theory can be represented in an information system. An information system can be characterized as a decision table, where the information is stored in a table. Each row in the table represents an individual record. Each column represents an attribute of the records or a field.
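The row-and-column view of a decision table described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy table, the names `TABLE` and `binary_discernibility_matrix`, and the convention that the last column holds the decision are all hypothetical, and the matrix is built with the usual textbook construction (one row per pair of objects with different decisions, one 0/1 entry per condition attribute), which may differ in detail from the definitions used in the paper.

```python
from itertools import combinations

# Hypothetical toy decision table: each row is one object (record).
# The first three columns are condition attributes c1..c3 and the
# last column is the decision attribute d.
TABLE = [
    # c1, c2, c3, d
    (0, 1, 0, "yes"),
    (0, 1, 1, "no"),
    (1, 0, 1, "no"),
    (1, 1, 0, "yes"),
]


def binary_discernibility_matrix(table):
    """Return {(i, j): row} for every object pair with different decisions.

    row[k] is 1 when condition attribute k takes different values on the
    two objects and 0 otherwise (textbook construction, assumed here).
    """
    matrix = {}
    for (i, x), (j, y) in combinations(enumerate(table), 2):
        if x[-1] == y[-1]:
            continue  # same decision: this pair does not need to be discerned
        matrix[(i, j)] = [int(a != b) for a, b in zip(x[:-1], y[:-1])]
    return matrix


matrix = binary_discernibility_matrix(TABLE)
for pair, row in matrix.items():
    print(pair, row)
```

Each printed row shows which condition attributes can tell one pair of objects apart; a set of attributes that covers at least one 1 in every row distinguishes every such pair.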

Definition 1

An information system is a 4-tuple as follows: $S=\langle U,R,V,f\rangle$

$U=\{x_1,x_2,\ldots,x_n\}$ is a non-empty finite set of objects, called the universe. $R=C\cup D$, where $C=\{c_1,c_2,c_3,\ldots,c_m\}$ is the set of

Incremental attribute core computing

Generally speaking, a data set often contains repetitive objects. Involving these redundant objects in the computation repeatedly is unnecessary and time-consuming. To avoid rebuilding elements of the binary discernibility matrix because of redundant objects, Definition 5 is redefined as follows.
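The cost of repetitive objects can be illustrated with a small Python sketch. It is not the paper's redefinition; it only assumes that "repetitive" means rows that are identical in every condition attribute and in the decision, and the function names and toy table are hypothetical. The sketch counts how many object pairs with different decisions would each need a matrix row before and after identical rows are dropped.

```python
from itertools import combinations


def drop_repetitive_objects(table):
    """Keep the first occurrence of each identical row, preserving order."""
    return list(dict.fromkeys(table))


def count_discerning_pairs(table):
    """Count object pairs whose decision values (last column) differ."""
    return sum(1 for x, y in combinations(table, 2) if x[-1] != y[-1])


# Hypothetical table with repeated rows: (c1, c2, d)
table = [
    (0, 1, "yes"),
    (0, 1, "yes"),  # repeats the first row
    (1, 0, "no"),
    (1, 0, "no"),   # repeats the third row
    (1, 1, "yes"),
]

unique_rows = drop_repetitive_objects(table)
pairs_before = count_discerning_pairs(table)
pairs_after = count_discerning_pairs(unique_rows)
print(pairs_before, pairs_after)
```

In this toy case the number of pairs falls from 6 to 2, which is the kind of saving that motivates handling repetitive objects before the matrix is built.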

Definition 6

For a given information system $S$, the binary discernibility matrix is redefined by $BM^{*}=[\mathit{exampair},\,m_{ij}^{*}]$ according to pairs of objects, where $\mathit{exampair}$ is the pair of objects $x_i$ and

Incremental attribute reduction for group dynamic data

Attribute reduction not only eliminates irrelevant and redundant attributes but also retains the important attributes that play essential roles in the classification problem. An incremental attribute reduction algorithm for group dynamic data is therefore worth studying, since it can significantly reduce the time consumption.

Proposition 1

[19] Let $S$ be a decision table, $U_1=POS_C(D)$ and $U_2=U-POS_C(D)$. If $x_1$ and $y_1$ are repetitive objects in $U_1$, $x_2$ and $y_2$ are repetitive in $U_2$, $x_3$ and $y_3$ are inconsistent in $U_2$,
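The split of the universe into the positive region and its complement, which the proposition starts from, can be sketched as follows. This is a minimal illustration using the standard rough set definition (an object is in the positive region when every object sharing its condition values also shares its decision); the function name, the toy table and the last-column-is-decision convention are assumptions, not taken from the paper.

```python
from collections import defaultdict


def positive_region(table):
    """Indices of objects whose condition class has a single decision value."""
    classes = defaultdict(list)
    for index, row in enumerate(table):
        classes[row[:-1]].append(index)  # group by condition-attribute values
    positive = set()
    for members in classes.values():
        decisions = {table[i][-1] for i in members}
        if len(decisions) == 1:  # consistent class: part of the positive region
            positive.update(members)
    return positive


# Hypothetical table: (c1, c2, d)
table = [
    (0, 1, "yes"),
    (0, 1, "yes"),  # repetitive with object 0, and consistent
    (1, 0, "no"),
    (1, 1, "yes"),
    (1, 1, "no"),   # same conditions as object 3, different decision
]

u1 = positive_region(table)           # positive region
u2 = set(range(len(table))) - u1      # remaining, inconsistent objects
print(sorted(u1), sorted(u2))
```

Here objects 0 and 1 are repetitive objects inside the positive region, while objects 3 and 4 are inconsistent and fall outside it.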

Experimental analysis

In this section, we conduct numerical experiments to assess the efficiency of the proposed algorithm. The experiments are implemented in MATLAB 2010A on a 2.2 GHz AMD processor with 4 GB of RAM running the Windows 7 operating system. For convenience, the incremental attribute reduction algorithm for group dynamic data is denoted by IARA_GD. To illustrate the efficiency of our algorithm, two different discernibility matrix based algorithms are used for comparison. One is the incremental

Conclusion

Incremental attribute reduction is very important in dynamic data analysis with rough set theory. Traditional incremental algorithms consider only a single dynamic object in each iteration. When a group of dynamic objects arrives at the same time, which often occurs in real-life applications, this kind of algorithm is not efficient enough. To update the attribute reduction results efficiently for group dynamic data, an incremental attribute reduction algorithm based on compressed binary

Acknowledgments

This research was supported by the Key Research and Development Program of China (2017YFD0401001), the National Natural Science Foundation of China (61833011, 61403184), the Major Program of the Natural Science Foundation of Jiangsu Province Education Commission (17KJA120001), “Qing Lan” Project of Jiangsu Province (QL2016), and “1311 Talent Plan” of Nanjing University of Posts and Telecommunications (NY2018).

Fumin Ma received her bachelor's degree from Henan University, China, in 2002, her master's degree from the Graduate University of Chinese Academy of Sciences in 2005, and her Ph.D. degree from Tongji University, China, in 2008. She was a visiting researcher at University College Dublin (UCD) from 2014 to 2015. She is currently a Professor and supervisor for master's students at Nanjing University of Finance and Economics. Her research interests include intelligent information processing and process industry modeling and simulation.

Mianwei Ding received his bachelor's degree from Jincheng College of Nanjing University of Aeronautics and Astronautics, China, in 2014, and his master's degree from Nanjing University of Posts and Telecommunications in 2017. His research interests include rough sets and neural computing.

Tengfei Zhang received his bachelor's degree from Henan University, China, in 2002. He received his master's and Ph.D. degrees from Shanghai Maritime University, China, in 2004 and 2007, respectively. He is currently a Professor and supervisor for master's students at Nanjing University of Posts and Telecommunications. His research interests include intelligent information processing and complex system modeling and control.
