Information Sciences

Volume 249, 10 November 2013, Pages 1-12

Adaptive neighborhood granularity selection and combination based on margin distribution optimization

https://doi.org/10.1016/j.ins.2013.06.012

Abstract

Granular computing aims to develop a granular view for interpreting and solving problems. The model of neighborhood rough sets is one of the effective tools for granular computing, and it can deal with complex classification learning tasks. Despite the success of the neighborhood model in attribute reduction and rule learning, it still suffers from the issue of granularity selection; that is, selecting a proper neighborhood granularity for a specific task remains an open problem. In this work, we explore ensemble learning techniques for adaptively evaluating and combining the models derived from multiple granularities. In the proposed framework, base classifiers are trained in different granular spaces, and the importance of the base classifiers is then learned by optimizing the margin distribution of the combined system. Experimental analysis shows that the proposed method can adaptively select a proper granularity, and that combining the models trained in multi-granularity spaces leads to competitive performance.

Introduction

Granular computing utilizes information granules, drawn together by indistinguishability, similarity, proximity, or functionality, to develop a granular view of the world and to solve problems described with incomplete, uncertain, or vague information [16], [37], [38]. Granular computing involves two basic issues: the construction of information granules and computation with these granules [34], [39]. Representative granular computing models include fuzzy sets, rough sets [11], [15], [18], fuzzy rough sets [19], [33], neighborhood rough sets [6], [7], covering rough sets [42], [43], and so on. Neighborhood rough sets form one of the most effective granular computing models for mining heterogeneous data, and have been successfully applied in vibration diagnosis [40], cancer recognition [5], and tumor classification [29].

Neighborhood rough sets extract information granules by computing the neighborhoods of samples; the feature space is thus granulated into a family of neighborhood information granules. Hu et al. introduced neighborhood attribute reduction and a classification algorithm based on the neighborhood model [6]. To interpret neighborhood information granules, the partition of the universe is replaced by a neighborhood covering, and a neighborhood covering reduction approach was derived to extract rules from numerical data [1].
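To make the granulation step concrete, the following sketch computes the δ-neighborhood of every sample in a numeric feature space: each granule collects the samples within distance δ. It is a minimal illustration assuming Euclidean distance on normalized features; the function name and toy data are ours, not the paper's.

```python
import numpy as np

def delta_neighborhoods(X, delta):
    """For each sample x_i, return the indices of all samples whose
    Euclidean distance to x_i is at most delta (its delta-neighborhood).
    The family of these neighborhoods granulates the feature space."""
    # Pairwise Euclidean distance matrix (n x n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return [np.flatnonzero(dist[i] <= delta) for i in range(len(X))]

# Toy example: three samples in a 2-D feature subspace, granularity 0.3.
X = np.array([[0.10, 0.20], [0.15, 0.25], [0.90, 0.80]])
granules = delta_neighborhoods(X, 0.3)  # the first two samples share a granule
```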

How to select a proper granularity is a key problem in granular computing [32], [17], [13]. The sizes of granules, the relations between granules, and the operations on granules provide the essential ingredients for developing a theory of granular computing [38]. The size of the neighborhood affects the consistency of neighborhood spaces and their approximation ability: the smaller the neighborhood, the higher the consistency of classification in the neighborhood space. As shown in Fig. 1, a test sample may be misclassified if the granularity is not set correctly for neighborhood rough sets [6] or the KNN classifier [12]. In [8], the impact of neighborhood size on attribute reduction based on neighborhood dependency was discussed. Although in neighborhood covering reduction [1] the neighborhood size of each sample varies with its position in the feature space, the selection of neighborhood size still relies on empirical values and remains an open problem.
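This trade-off can be made tangible with a small purity measure: the fraction of samples whose δ-neighborhood contains only samples of their own class, which falls as δ grows. The purity criterion below is a simple stand-in for the neighborhood dependency studied in [8], again assuming Euclidean distance; names and data are illustrative.

```python
import numpy as np

def neighborhood_consistency(X, y, delta):
    """Fraction of samples whose delta-neighborhood contains only
    samples of their own class. Small delta -> high consistency but
    tiny granules; large delta -> coarse granules and more class mixing."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    pure = sum(np.all(y[dist[i] <= delta] == y[i]) for i in range(len(X)))
    return pure / len(X)

# Sweeping delta exposes the granularity-selection problem on toy data.
rng = np.random.default_rng(0)
X = rng.random((100, 4))
y = (X[:, 0] + X[:, 1] > 1).astype(int)
for delta in (0.05, 0.2, 0.5):
    print(delta, neighborhood_consistency(X, y, delta))
```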

Given a learning task, we may obtain diverse results in different granular spaces; combining these results may therefore improve performance. As illustrated in Fig. 1, we can recognize a person from the global face or from local patches [4], and combining global and local information may greatly improve recognition performance [27], [41]. It is known that there are multiple attribute reducts that preserve the discrimination ability of the original feature space. In different granular spaces, we can obtain a set of attribute reducts carrying complementary information, and the outputs from these granular spaces can be combined.

Boosting [2] and AdaBoost [21] are among the most typical and successful ensemble learning methods. They learn the weights of base classifiers, and the final output is a linear weighted combination of the individual outputs. Schapire [23] explained AdaBoost in terms of the margin distribution and gave a generalization bound. In [35], a bagging pruning technique based on margin distribution optimization was proposed. In [41], an ensemble face recognition method was proposed that combines multi-scale outputs by optimizing the margin distribution.
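For binary labels y_i ∈ {−1, +1}, base outputs h_t(x_i) ∈ {−1, +1}, and classifier weights w_t, the margin of sample i under such a linear combination is y_i Σ_t w_t h_t(x_i), and Schapire's analysis bounds generalization in terms of the distribution of these margins. A minimal sketch of this standard quantity (the function name is ours):

```python
import numpy as np

def ensemble_margins(H, y, w):
    """Margins of a weighted vote of base classifiers.
    H: (n, T) matrix of base predictions in {-1, +1};
    y: (n,) true labels in {-1, +1};
    w: (T,) nonnegative weights summing to 1.
    A positive margin means the combined prediction is correct; the
    larger the margin, the more confident the ensemble on that sample."""
    return y * (H @ w)
```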

In this paper, we propose a technique to select and combine different granularities based on margin distribution optimization. In each neighborhood granular space, we train a corresponding classification model. By optimizing the margin distribution of the final decision function, we derive the weights of the different granularities; the granularity with the largest weight is considered optimal. In addition, the weights can be used to rank the granularities or to combine the recognition results from different granular spaces. Experimental results show that the proposed granularity selection and combination method can significantly improve classification performance.
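As a rough sketch of the idea, the code below learns granularity weights on the probability simplex by penalizing margins that fall short of a target θ, a hinge-style surrogate for pushing the margin distribution upward. The objective, the threshold θ, and the function names are our assumptions for illustration; the paper's actual optimization problem is given in Section 3.

```python
import numpy as np
from scipy.optimize import minimize

def learn_granularity_weights(H, y, theta=0.1):
    """Learn weights over models trained at different granularities.
    H: (n, T) predictions in {-1, +1} of T granularity-specific models;
    y: (n,) labels in {-1, +1}.
    Minimizes the average hinge penalty max(0, theta - margin) over
    weights constrained to the probability simplex."""
    n, T = H.shape

    def loss(w):
        margins = y * (H @ w)
        # Nonsmooth hinge; SLSQP treats it approximately, which is
        # adequate for this illustrative sketch.
        return np.maximum(0.0, theta - margins).mean()

    res = minimize(
        loss,
        np.full(T, 1.0 / T),                      # start from uniform weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * T,                  # nonnegative weights
        constraints=({"type": "eq", "fun": lambda w: w.sum() - 1.0},),
    )
    return res.x

# The granularity with the largest weight is taken as optimal, and the
# combined decision for new samples is sign(H_new @ w).
```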

The structure of this paper is as follows. In Section 2, neighborhood-based granular models are introduced. Section 3 presents the granularity selection and combination method. In Section 4, an experimental analysis is given to show the performance of the proposed method. Finally, conclusions and future work are presented in Section 5.

Section snippets

Neighborhood granular models

In this section, the neighborhood-based granular computing model is introduced. The granularity sensitivity of the neighborhood granular models is discussed in Section 2.3.

Granularity selection and combination

Both neighborhood classifiers and neighborhood attribute reduction are sensitive to the granularity δ, so granularity selection is a non-trivial task. The information carried by different granularities may be different and complementary. Assume that three features {13, 1, 10} of the wine data are selected by neighborhood feature selection. Rules are then learned separately in the feature subspaces {13, 10} and {1, 10}, as shown in Fig. 4. The learned rules are different and they may be complementary.
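One way to realize this setup, sketched here under our own assumptions (scikit-learn's KNeighborsClassifier as the base learner, 0-based feature indices), is to train one base model per feature subspace and stack their test predictions into the matrix H consumed by the weight-learning sketch above:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def subspace_predictions(X_train, y_train, X_test, subspaces):
    """Fit one classifier per feature subspace and return an
    (n_test, T) matrix of predictions, one column per subspace."""
    cols = []
    for fs in subspaces:
        clf = KNeighborsClassifier(n_neighbors=3)
        clf.fit(X_train[:, fs], y_train)
        cols.append(clf.predict(X_test[:, fs]))
    return np.stack(cols, axis=1)

# E.g. the two wine subspaces from the text, converted to 0-based indices:
# H = subspace_predictions(X_tr, y_tr, X_te, [[12, 9], [0, 9]])
```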

Experiment analysis

To show the effectiveness of the proposed method, we first report the granularity weights in Section 4.1. Then, for granularity-sensitive classifiers, we take NEC and KNN as examples to show the superiority of the proposed GSC_MD in Section 4.2. Finally, an experiment on classification based on multi-granularity subspaces is conducted.

Conclusions and future work

Neighborhood rough set is a granularity-sensitive granular computing model. We can train multiple models at different granularities, which leads to diverse granular views of a learning task. As the base classifiers trained in different granular spaces are complementary, in this paper we explore ensemble learning techniques to solve the granularity selection and combination problem. By optimizing the margin distribution, we learn the weights of the different granularities. The weights are then used for granularity selection and combination.

Acknowledgments

This work is partly supported by the National Program on Key Basic Research Project under Grant 2013CB329304, the National Natural Science Foundation of China under Grants 61222210 and 61105054, and the Program for New Century Excellent Talents in University under Grant NCET-12-0399.

References (43)

  • S. Wang et al., Tumor classification by combining PNN classifier ensemble with neighborhood rough set based gene reduction, Computers in Biology and Medicine (2010)
  • W. Wu et al., Theory and applications of granular labelled partitions in multi-scale decision tables, Information Sciences (2011)
  • W. Wu et al., Generalized fuzzy rough sets, Information Sciences (2003)
  • W. Wu et al., Neighborhood operator systems and approximations, Information Sciences (2002)
  • Z. Xie et al., Margin distribution based bagging pruning, Neurocomputing (2012)
  • Y. Yao, Relational interpretations of neighborhood operators and rough set approximation operators, Information Sciences (1998)
  • W. Zhu, Topological approaches to covering rough sets, Information Sciences (2007)
  • W. Zhu et al., Reduction and axiomization of covering generalized rough sets, Information Sciences (2003)
  • Y. Freund, R. Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, in: ...
  • R. Gilad-Bachrach, A. Navot, N. Tishby, Margin based feature selection-theory and algorithms, in: Proceedings of the ...
  • B. Heisele, P. Ho, T. Poggio, Face recognition with support vector machines: global versus component-based approach, ...