Feature selection for dynamic interval-valued ordered data based on fuzzy dominance neighborhood rough set
Introduction
With the development of the information age, many fields must handle complex data, and interval-valued data is an important representative. Interval-valued data is widely used in the real world, usually to characterize inaccurate and ambiguous information such as fluctuations of commodity prices [1], changes of temperature [2], and ranges of physiological indicators [3]. In multi-criteria decision analysis problems, interval-valued data follows a preference-ordered relation and is called interval-valued ordered data [4]. In practical applications, interval-valued ordered data evolves over time, giving dynamic interval-valued ordered data [5], [6], which makes efficient data mining challenging.
Feature selection is a common dimensionality reduction method in data mining. It identifies the more relevant features and reduces the dimension of the data, thereby improving the classification ability of learning models [7], [8], [9], [10], [11]. For dynamic data, traditional feature selection methods suffer from low computational efficiency. To improve efficiency, feature selection algorithms with incremental technology have attracted increasing research attention [12], [13], [14], [15], [16]. Nevertheless, there is so far no incremental feature selection method for dynamic interval-valued ordered data. To fill this gap, we study feature selection with incremental technology on dynamic interval-valued ordered data.
Rough set theory (RST) is a granular computing tool that is widely used to deal with uncertain and vague information. In RST, interval-valued data is represented as an interval-valued information system (IvIS). In recent years, several extended rough set models for IvIS have been proposed, as shown in Table 1.
Although some dominance-based rough set approach (DRSA) models have been extended to IvODS in the studies above, these models cannot describe the preference-ordered relation between objects in IvODS both qualitatively and quantitatively. The fuzzy preference based rough sets model [25], proposed by Hu et al., can make up for this deficiency, so extending it to IvODS is worthwhile. However, this model is not robust: it does not account for the fact that the boundaries of interval numbers are easily disturbed by noise, which perturbs the endpoint values. This shortcoming leaves the knowledge granules without fault tolerance (flexibility), so they may give decision-makers wrong information and eventually lead to wrong decisions. Inspired by this, we introduce the idea of neighborhood into the fuzzy preference based rough sets model and propose a new model that makes the knowledge granules robust, namely the fuzzy dominance neighborhood rough set (FDNRS) model of IvODS.
Uncertainty measurement is an important research topic in RST. In recent years, RST-based uncertainty metrics for interval-valued data have attracted the attention of many scholars. Some representative works are shown in Table 2. However, these metrics do not take into account the preference-ordered relation between objects in IvODS. For ordered data, Hu et al. proposed rank conditional entropy and fuzzy rank conditional entropy [26], which were then applied to feature selection [27] and decision trees [28] for monotonic classification tasks. Inspired by this, we introduce an FDNRS-based conditional entropy, called fuzzy dominance neighborhood conditional entropy (FDNCE), to evaluate how consistently samples are ordered under the features and the decision in IvODS. In this study, FDNCE is used as the feature evaluation index for feature selection in IvODS.
Feature selection is also called attribute reduction in RST. Some RST-based attribute reduction methods have been extended or further improved for interval-valued data, as shown in Table 3. However, these attribute reduction methods have two shortcomings. On the one hand, they do not consider interval-valued data with a preference-ordered relation. On the other hand, for interval-valued data with dynamic characteristics, they incur a high time cost, because they must be executed again from scratch whenever new data arrives or old data is removed, which causes many unnecessary calculations. Therefore, it is worthwhile to study an efficient attribute reduction method that can be applied to dynamic interval-valued ordered data.
Feature selection with an incremental mechanism can efficiently extract the necessary attributes from dynamic datasets. In recent years, research on incremental feature selection has attracted the attention of many scholars. Some recent works are presented in Table 4. Although much work has been done on incremental feature selection methods, the existing methods are not suitable for dynamic interval-valued ordered data. This gap motivates our study.
In this study, we propose incremental feature selection methods based on the FDNRS model for dynamic interval-valued ordered datasets with time-evolving objects. The major contributions of this study are as follows.
We propose a new rough set model, FDNRS, for IvODS and give reasonable explanations of its approximation operators. Moreover, the relevant properties of this model are presented and proved.
We define a robust uncertainty metric, FDNCE, based on the FDNRS model, which evaluates the degree of ranking consistency of objects in IvODS. This metric is proven to be non-monotonic and is then combined with a heuristic feature selection strategy.
Based on the above results, we propose two incremental feature selection algorithms, one for the case where a group of objects is added to an IvODS and one for the case where a group of objects is deleted from it.
Comparison experiments are performed on public datasets, and the results demonstrate the robustness of the proposed metric and the effectiveness and efficiency of the proposed incremental algorithms.
The remainder of the paper is organized as follows. Section 2 introduces the related knowledge. In Section 3, the FDNRS model of IvODS is proposed, and its relevant properties are investigated. Section 4 proposes FDNCE and an FDNCE-based heuristic non-monotonic feature selection algorithm for IvODS. In Section 5, two incremental feature selection methods are introduced. The results and analysis of our experiments are reported in Section 6. Finally, Section 7 summarizes the study and outlines further work.
Preliminaries
In this section, some basic concepts are introduced, which can be found in the literature [4], [54].
Fuzzy dominance neighborhood rough set to IvODS
In this section, we propose a new model for IvODS, called the FDNRS model. This model considers the preference-ordered relation between objects in IvODS both qualitatively and quantitatively. In addition, it incorporates the idea of neighborhood to reduce the influence of noise on knowledge. The relevant definitions and properties are introduced as follows.
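To make the ingredients named above concrete, the following Python sketch contrasts a crisp (qualitative) dominance test on intervals with a graded (quantitative) preference degree that ignores small endpoint perturbations. It is only an illustration under stated assumptions and is not the FDNRS definition from the paper: the endpoint-wise dominance test, the logistic preference function with slope `k`, the tolerance `delta`, and all function names are assumptions introduced here.

```python
import math


def logistic(d, k=5.0):
    """Map a signed difference d to a preference degree in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-k * d))


def crisp_dominates(x, y):
    """Qualitative test: interval x = (lower, upper) dominates y
    when both of its endpoints are at least as large (assumed convention)."""
    return x[0] >= y[0] and x[1] >= y[1]


def fuzzy_dominance(x, y, k=5.0, delta=0.0):
    """Quantitative degree to which x is preferred to y.

    Hypothetical construction: endpoint differences whose magnitude is
    at most delta are treated as zero, so small noise on the interval
    boundaries does not change the degree. The two endpoint degrees are
    combined with min.
    """
    def damp(d):
        return 0.0 if abs(d) <= delta else d

    lower_degree = logistic(damp(x[0] - y[0]), k)
    upper_degree = logistic(damp(x[1] - y[1]), k)
    return min(lower_degree, upper_degree)


# Two nearly identical intervals: crisp dominance fails in both directions,
# while the tolerant fuzzy degree treats them as indistinguishable.
a, b = (1.02, 2.00), (1.00, 2.01)
print(crisp_dominates(a, b), crisp_dominates(b, a))
print(fuzzy_dominance(a, b, delta=0.05))
```

With `delta=0.0` the sketch reduces to a purely fuzzy preference degree; a larger `delta` trades sensitivity for tolerance to endpoint noise.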
Conditional entropy based on FDNRS and non-monotonic feature selection in IvODS
As a common uncertainty measure, information entropy is widely used in feature selection tasks [27], [28], [39]. In this section, we first propose a conditional entropy based on FDNRS, called FDNCE, and analyze its monotonicity. Afterwards, we define a non-monotonic reduct search strategy using FDNCE. Finally, we introduce a heuristic feature selection algorithm with the non-monotonic reduct search strategy.
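The general shape of an entropy-guided heuristic search can be sketched as follows. This is a simplified stand-in, not FDNCE or the paper's algorithm: the entropy below uses crisp dominance classes in the spirit of a rank conditional entropy (the exact formula is an assumption), and the greedy loop simply stops when no remaining feature lowers the entropy, which is one common way to handle a non-monotonic evaluation measure. All function names and the toy data are hypothetical.

```python
import math


def dominating_set(data, i, features):
    """Indices of objects whose intervals dominate object i on every
    feature in `features` (both endpoints at least as large)."""
    return {
        j for j, row in enumerate(data)
        if all(row[f][0] >= data[i][f][0] and row[f][1] >= data[i][f][1]
               for f in features)
    }


def rank_conditional_entropy(data, labels, features):
    """Assumed crisp measure: -(1/n) * sum_i log(|F_i & D_i| / |F_i|),
    where F_i dominates object i on the features and D_i on the decision.
    It is 0 when the feature ordering is fully consistent with the decision."""
    n = len(data)
    total = 0.0
    for i in range(n):
        by_features = dominating_set(data, i, features)
        by_decision = {j for j in range(n) if labels[j] >= labels[i]}
        total += math.log(len(by_features & by_decision) / len(by_features))
    return -total / n


def greedy_select(data, labels, n_features, tol=1e-12):
    """Forward selection: add the feature that lowers the entropy most,
    and stop as soon as no candidate gives an improvement."""
    selected = []
    best = rank_conditional_entropy(data, labels, selected)
    remaining = set(range(n_features))
    while remaining:
        scores = {f: rank_conditional_entropy(data, labels, selected + [f])
                  for f in remaining}
        f_best = min(scores, key=lambda f: (scores[f], f))
        if scores[f_best] >= best - tol:
            break
        selected.append(f_best)
        best = scores[f_best]
        remaining.remove(f_best)
    return selected, best


# Toy data: feature 0 is ordered like the decision, feature 1 is not.
toy_data = [
    [(1, 2), (5, 6)],
    [(2, 3), (1, 2)],
    [(3, 4), (4, 5)],
    [(4, 5), (2, 3)],
]
toy_labels = [0, 0, 1, 1]
print(greedy_select(toy_data, toy_labels, n_features=2))
```

Because the evaluation measure need not decrease when features are added, the stopping rule compares each candidate against the current best value rather than assuming that the full feature set is optimal.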
Incremental feature selection for dynamic IvODS with the variation of multiple objects
For dynamic IvODS, employing the HFS-IvO algorithm to compute a reduct is very time-consuming, especially on large data, because the algorithm treats the changed IvODS as a new one and recalculates knowledge from scratch. To improve efficiency, this section presents two incremental algorithms for feature selection on the basis of the HFS-IvO algorithm.
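The saving that an incremental mechanism aims for can be illustrated on a simpler structure than the paper's algorithms. The Python sketch below maintains a pairwise dominance relation matrix and, when objects are added or deleted, reuses every entry between unchanged objects instead of recomputing the whole matrix. It is a minimal illustration under assumptions (crisp endpoint-wise dominance, hypothetical function names), not the incremental algorithms proposed in this section.

```python
def dominates(x, y):
    """x and y are lists of (lower, upper) intervals, one per feature.
    Assumed convention: x dominates y if every endpoint is at least as large."""
    return all(a[0] >= b[0] and a[1] >= b[1] for a, b in zip(x, y))


def build_relation(objects):
    """From-scratch computation: n * n dominance comparisons."""
    return [[dominates(x, y) for y in objects] for x in objects]


def add_objects(objects, relation, new_objects):
    """Extend `objects` and `relation` in place. Old-versus-old entries
    are reused; only pairs involving a new object are evaluated."""
    old_n = len(objects)
    objects.extend(new_objects)
    n = len(objects)
    for i in range(old_n):
        relation[i].extend(dominates(objects[i], objects[j])
                           for j in range(old_n, n))
    for i in range(old_n, n):
        relation.append([dominates(objects[i], objects[j]) for j in range(n)])
    return relation


def delete_objects(objects, relation, indices):
    """Return the reduced objects and relation without any new comparison:
    rows and columns of the deleted objects are simply dropped."""
    drop = set(indices)
    keep = [i for i in range(len(objects)) if i not in drop]
    kept_objects = [objects[i] for i in keep]
    kept_relation = [[relation[i][j] for j in keep] for i in keep]
    return kept_objects, kept_relation


objs = [[(1, 2), (0, 1)], [(2, 3), (1, 2)]]
rel = build_relation(objs)
add_objects(objs, rel, [[(0, 1), (2, 3)], [(3, 4), (2, 4)]])
objs, rel = delete_objects(objs, rel, [0])
print(rel == build_relation(objs))
```

Adding m objects to n existing ones costs about (n + m)^2 - n^2 comparisons here instead of (n + m)^2, and deletion needs none, which is the kind of reuse an incremental method exploits.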
Experiments and analysis
In this section, we perform a series of experiments to test the robustness of the proposed metric and evaluate the performance of the proposed incremental feature selection algorithms. The computer used for the experiments is configured as follows: the CPU is an Intel(R) Core(TM) i7-8700 with a clock speed of 3.20 GHz, the memory is 16.0 GB, and the operating system is 64-bit Windows 10. The algorithms are coded in Java and run on the Java platform. The code of the algorithms can be downloaded from the GitHub homepage.
Conclusion and future work
In this study, we propose incremental feature selection methods based on FDNRS for dynamic interval-valued ordered data. The main contributions are as follows: (1) We propose an FDNRS model for IvODS and present its relevant properties. (2) Based on the proposed model, a robust conditional entropy (i.e., FDNCE) is proposed for attribute reduction of IvODS. (3) For dynamically adding objects to or deleting objects from an IvODS, we develop two corresponding incremental feature selection algorithms.
CRediT authorship contribution statement
Binbin Sang: Methodology, Validation, Writing - original draft, Writing - review & editing. Hongmei Chen: Conceptualization, Resources, Visualization, Supervision, Project administration, Funding acquisition. Lei Yang: Formal analysis, Data curation. Tianrui Li: Resources, Supervision, Funding acquisition. Weihua Xu: Resources, Funding acquisition. Chuan Luo: Resources, Supervision.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Nos. 61976182, 61572406, 62076171, 61602327, 61876157, 61976245), the Sichuan Key R&D project, China (2020YFG0035), and the Key Program for International S&T Cooperation of Sichuan Province, China (2019YFH0097).
References (64)
- et al., Fuzzy c-ordered medoids clustering for interval-valued data, Pattern Recognit. (2016)
- et al., Multivalued type proximity measure and concept of mutual similarity value useful for clustering symbolic patterns, Pattern Recognit. Lett. (2004)
- et al., Interval ordered information systems, Comput. Math. Appl. (2008)
- et al., Dynamic computing rough approximations approach to time-evolving information granule interval-valued ordered information system, Appl. Soft Comput. (2017)
- et al., An incremental algorithm for attribute reduction with variable precision rough sets, Appl. Soft Comput. (2016)
- et al., An incremental approach for attribute reduction based on knowledge granularity, Knowl.-Based Syst. (2016)
- et al., Related families-based attribute reduction of dynamic covering decision information systems, Knowl.-Based Syst. (2018)
- et al., Incremental approaches for feature selection from dynamic data with the variation of multiple objects, Knowl.-Based Syst. (2019)
- et al., Rough set theory for the interval-valued fuzzy information systems, Inform. Sci. (2008)
- et al., Fuzzy rough set theory for the interval-valued fuzzy information systems, Inform. Sci. (2008)
- A rough set approach for the discovery of classification rules in interval-valued information systems, Internat. J. Approx. Reason.
- Dominance-based rough set approach to incomplete interval-valued information system, Data Knowl. Eng.
- Variable-precision-dominance-based rough set approach to interval-valued information systems, Inform. Sci.
- -dominance relation and rough sets in interval-valued information systems, Inform. Sci.
- Fuzzy preference based rough sets, Inform. Sci.
- Uncertainty measurement for interval-valued decision systems based on extended conditional entropy, Knowl.-Based Syst.
- Uncertainty measurement for interval-valued information systems, Inform. Sci.
- Information granulation and uncertainty measures in interval-valued intuitionistic fuzzy information systems, European J. Oper. Res.
- Uncertainty measurement for incomplete interval-valued information systems based on -weak similarity, Knowl.-Based Syst.
- New measures of uncertainty for an interval-valued information system, Inform. Sci.
- Multi-confidence rule acquisition and confidence-preserved attribute reduction in interval-valued decision systems, Internat. J. Approx. Reason.
- A fuzzy rough set approach for incremental feature selection on hybrid information systems, Fuzzy Sets and Systems
- Incremental approaches for updating reducts in dynamic covering information systems, Knowl.-Based Syst.
- A group incremental feature selection for classification using rough set theory based genetic algorithm, Appl. Soft Comput.
- Discernibility matrix based incremental attribute reduction for dynamic data, Knowl.-Based Syst.
- Incremental approaches to updating reducts under dynamic covering granularity, Knowl.-Based Syst.
- Incremental feature selection based on fuzzy rough sets, Inform. Sci.
- Discernibility matrix based incremental feature selection on fused decision tables, Internat. J. Approx. Reason.
- Approximate distribution reducts in inconsistent interval-valued ordered decision tables, Inform. Sci.
- Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic, Fuzzy Sets and Systems
- Incremental updating of rough approximations in interval-valued information systems under attribute generalization, Inform. Sci.
- Dynamic dominance rough set approach for processing composite ordered data, Knowl.-Based Syst.