Incremental feature weighting for fuzzy feature selection
Introduction
The online monitoring of high-dimensional dynamical data has become increasingly important for observing systems in a wide range of advanced applications, such as telephone records, large sets of web pages, multimedia data, and sets of retail chain transactions [1]. Due to the complexity of dynamical systems, incremental learning is essential for knowledge discovery, which is an important aspect of human intelligence. Many applications need to process data in a sequence, which has led to the development of a wide range of incremental learning methods [2], [3]. In particular, two types of situations must be addressed. In the first situation, the data set cannot be collected in one pass and batch computing cannot be conducted, e.g., online applications [4] or interaction queries [5]. In the second situation, the data set is so large that it cannot be processed in one pass due to limitations on computation capability and memory size. Thus, the data set must be divided into segments and added successively. The relationships between the newly added data and the stored data structure must then be analyzed in order to acquire new knowledge.
It is difficult to apply an incremental feature selection process to compensate for different operational modes because the importance levels of features can change dynamically during the overall learning process. For example, specific features may be much more important initially than later in the process. Therefore, many incremental feature selection algorithms have been proposed for synchronously updating the classifiers [6], [7], [8], [10]. In [6], the incrementally selected features for representing the information in the overall data set are evaluated using classifiers, which increases the computational cost of the feature selection process.
In [7], a wrapper feature selection algorithm with an incremental Bayesian classifier (WFSIBC) was proposed, where a greedy algorithm is used to find the optimal feature subset, which increases the time complexity and space complexity. The incremental feature neighborhood rough set (IFNRS) algorithm was presented by [8] to select features incrementally, but without considering constraints on discrete data. An incremental feature selection algorithm based on information granularity was proposed by [9], where the non-matrix structure improves the efficiency of the algorithm. Other incremental feature selection algorithms such as those presented by [10], [11], [12], [13] can be employed for regression prediction. In the method described by [10], useful features are selected incrementally by combining clustering algorithms with four new feature quality indices, but the number of clusters needs to be predefined. A genetic algorithm and the K-means algorithm are used to determine the objective function and the optimal range, respectively, in the method proposed by [11], but the resulting feature subset is not globally optimal. An online unsupervised multi-view feature selection (OMVFS) algorithm was presented by [12], where feature selection is embedded directly in a clustering algorithm that retains the local information structure, but the parameters of the clustering algorithm need to be set in advance, which leads to poor adaptability. The feature selection based on data streams (FSDS) algorithm was proposed by [13] for determining the importance levels of features according to the regression results based on approximate data, but it has no independent evaluation index and it is mainly employed for updating the regression model.
It is difficult to apply an incremental feature selection process to a fuzzy feature space with fuzzy linguistic terms in order to improve the interpretability of the system. A significant problem when learning fuzzy features from data is the so-called curse of dimensionality, where the fuzzy feature space has higher dimensionality than the original feature space, which increases the complexity of the system. This problem was addressed by [14], where the fuzzy features were selected by evaluating the importance level of each fuzzy feature according to the minimum–maximum learning rule and the extended matrix built from the fuzzy data, but this algorithm is unsuitable for dynamic systems. A local fuzzy feature selection strategy based on switching to a neighboring model was proposed by [15] for capturing the local dynamics, but it is not possible to inactivate and reactivate the input variables in different stages. An online incremental feature weighting method was presented by [16] for updating the classifiers according to a separability criterion, where a weight was calculated for each feature and compared across all of the features. The most important feature received a weight of 1 and the others were weighted between 0 and 1. The weights of all the features were then used directly in the model for updating and inference, but this method may automatically lead to the curse of dimensionality during rule evolution. Soft feature selection using a feature weighting method was presented by [17] based on a classifier, where the Fisher separability criterion was applied in the feature space to select suitable features, although it was more convenient to apply in the empirical feature space.
An adaptive incremental fuzzy feature selection method using a neo-fuzzy neural network was presented by [18], which selects the model input and updates the network weights simultaneously, although this method leads to redundant features because the dependencies between the features are not analyzed. As described above, the inclusion of feature weights discriminates between more important and less important features, thereby providing additional information to users and experts by giving insights into the feature selection process and its interpretability. However, in order to implement feature selection, these algorithms depend on a classifier or regression model, which increases the complexity of the model selection procedure.
Section snippets
Proposed approach
In this study, we propose an incremental feature weighting for fuzzy feature selection (IWFFS) algorithm with two main phases: offline fuzzy feature selection and online fuzzy feature selection. First, the fuzzy data are partitioned by sliding windows. In the first sliding window, the offline fuzzy feature selection algorithm is applied, where the weight of each fuzzy input feature is calculated according to the mutual information between the fuzzy input features and the output feature, and
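The mutual-information-based weighting described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the fuzzy input features have already been discretised into membership levels, computes the empirical mutual information between each feature column and the output column, and normalises so the most informative feature receives weight 1. The function names `mutual_information` and `fuzzy_feature_weights` are our own for illustration.

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px = Counter(x)                     # marginal counts of x
    py = Counter(y)                     # marginal counts of y
    pxy = Counter(zip(x, y))            # joint counts of (x, y)
    mi = 0.0
    for (a, b), c in pxy.items():
        p_ab = c / n
        # p_ab * log(p_ab / (p_a * p_b)), with counts substituted for probabilities
        mi += p_ab * math.log(c * n / (px[a] * py[b]))
    return mi

def fuzzy_feature_weights(window, labels):
    """Weight each fuzzy input feature in the current window by its mutual
    information with the output feature, normalised into [0, 1] so that the
    most informative feature receives weight 1.0."""
    scores = [mutual_information(col, labels) for col in window]
    top = max(scores) or 1.0            # guard against an all-zero window
    return [s / top for s in scores]
```

For example, a feature column identical to the output receives weight 1.0, while a constant (uninformative) column receives weight 0.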
Basic concept
The incremental feature weighting concept is an attempt to reduce the curse of dimensionality. This method is generally used for incremental feature selection by assigning continuous weights in the range of [0, 1], which can be treated as the importance levels of the features in dynamic systems, instead of using crisp weights from {0, 1}. When the feature weights approach 0, their importance level is relatively low compared with others that have values near 1. The main advantage of feature
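The practical difference between continuous weights in [0, 1] and crisp weights in {0, 1} can be illustrated with a weighted distance: continuous weights shrink the influence of less important features rather than discarding them outright. The weighted Euclidean distance below is an illustrative choice of ours, not a formula from the paper.

```python
def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between samples a and b.
    Continuous weights in [0, 1] smoothly attenuate less important features;
    crisp weights in {0, 1} reduce this to hard inclusion/exclusion."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)) ** 0.5
```

With crisp weights [1, 1] the distance between (0, 0) and (3, 4) is 5; setting the second weight to 0 removes that feature entirely and yields 3, while an intermediate weight such as 0.5 only dampens its contribution.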
IWFFS framework
To enhance the effectiveness of incremental fuzzy feature selection, reduce the complexity, and identify the evolving relationships for each fuzzy input feature, the proposed framework for the IWFFS is shown in Fig. 1. First, the fuzzy data are partitioned using sliding windows, where offline fuzzy feature selection is used in the first sliding window and online fuzzy feature selection is applied in each subsequent sliding window. During the offline fuzzy feature selection process,
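The overall processing loop of the framework can be sketched as follows. This is a hedged outline, not the paper's algorithm: `score_fn` stands in for whatever per-feature relevance measure is computed on a window, and the smoothing factor `alpha` and `keep_threshold` are illustrative assumptions we introduce for the online update, not parameters taken from the paper.

```python
def iwffs_stream(windows, score_fn, alpha=0.7, keep_threshold=0.1):
    """Sketch of a sliding-window feature weighting loop: offline weighting on
    the first window, then incremental (online) updates on each later window.
    `score_fn(window)` returns one raw relevance score per fuzzy feature."""
    weights = None
    for window in windows:
        scores = score_fn(window)
        top = max(scores) or 1.0
        new_w = [s / top for s in scores]          # normalise into [0, 1]
        if weights is None:
            weights = new_w                        # offline phase: first window
        else:
            # online phase: blend previous weights with new evidence
            weights = [alpha * w + (1 - alpha) * nw
                       for w, nw in zip(weights, new_w)]
        # features whose weight stays above the threshold remain active;
        # a weight can later rise again, reactivating the feature
        selected = [i for i, w in enumerate(weights) if w >= keep_threshold]
        yield weights, selected
```

The blending step is what lets a feature's importance drift across windows, so features can be inactivated and reactivated as the system evolves.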
Experimental setup
Several experiments were conducted to test the proposed dynamic fuzzy feature selection method. All of the data sets were downloaded from the UCI Machine Learning Repository [26]. The statistics for the data sets employed in all of our experiments and analysis are shown in Table 1. Table 1 shows that some of the data sets had continuous features but discrete classes, such as the classification data sets comprising SD, BCH, Glass, Liver, Wine,
Conclusion
In this study, we proposed the IWFFS algorithm, which mainly evaluates the importance levels of the fuzzy features according to the weights of the fuzzy features in the current window. In the offline fuzzy feature selection stage, the optimal fuzzy input feature subset is obtained by the backward feature selection algorithm and the fuzzy feature selection index. In the online fuzzy feature selection stage, based on the candidate fuzzy feature subset in the current sliding window and the
Acknowledgements
This study was supported by the National Natural Science Foundation of China (Grant No. 61572073) and the National Key R&D Program of China (Grant No. 2017YFB0306403).
References (31)
- et al., An incremental approach for attribute reduction based on knowledge granularity, J. Shandong Univ. (2016)
- et al., Fuzzy feature selection based on min–max learning rule and extension matrix, Pattern Recognit. (2008)
- On-line incremental feature weighting in evolving fuzzy classifiers, Fuzzy Sets Syst. (2011)
- Knowledge discovery from data streams, Intell. Data Anal. (2009)
- et al., Evolving Intelligent Systems: Methodology and Applications (2010)
- et al., Learning in Non-Stationary Environments: Methods and Applications (2012)
- et al., Online neural network model for non-stationary and imbalanced data stream classification, Int. J. Mach. Learn. Cybern. (2014)
- et al., Query interaction based approach for horizontal data partitioning, Int. J. Data Warehous. Min. (2015)
- et al., Online feature selection and its applications, IEEE Trans. Knowl. Data Eng. (2006)
- et al., Feature selection for transient stability assessment based on improved maximal relevance and minimal redundancy criterion, Proc. CSEE (2013)