Information Sciences

Volume 249, 10 November 2013, Pages 1-12

Adaptive neighborhood granularity selection and combination based on margin distribution optimization

https://doi.org/10.1016/j.ins.2013.06.012

Abstract

Granular computing aims to develop a granular view for interpreting and solving problems. The model of neighborhood rough sets is one of the effective tools for granular computing, and it can deal with complex classification learning tasks. Despite the success of the neighborhood model in attribute reduction and rule learning, it still suffers from the issue of granularity selection; that is, selecting a proper neighborhood granularity for a specific task remains an open problem. In this work, we explore ensemble learning techniques for adaptively evaluating and combining the models derived from multiple granularities. In the proposed framework, base classifiers are trained in different granular spaces, and the importance of the base classifiers is then learned by optimizing the margin distribution of the combined system. Experimental analysis shows that the proposed method can adaptively select a proper granularity, and that combining the models trained in multi-granularity spaces leads to competitive performance.

Introduction

Granular computing utilizes information granules, drawn together by indistinguishability, similarity, proximity, or functionality, to develop a granular view of the world and to solve problems described with incomplete, uncertain, or vague information [16], [37], [38]. Granular computing involves two basic issues: the construction of information granules and computation with these granules [34], [39]. Representative granular computing models include fuzzy sets, rough sets [11], [15], [18], fuzzy rough sets [19], [33], neighborhood rough sets [6], [7], covering rough sets [42], [43], and so on. Neighborhood rough sets form one of the most effective granular computing models for mining heterogeneous data, and have been successfully applied in vibration diagnosis [40], cancer recognition [5], and tumor classification [29].

Neighborhood rough sets extract information granules by computing the neighborhoods of samples; the feature space is thus granulated into a family of neighborhood information granules. Hu et al. introduced neighborhood attribute reduction and a classification algorithm based on the neighborhood model [6]. To interpret neighborhood information granules, the partition of the universe is replaced by a neighborhood covering, and a neighborhood covering reduction approach was derived to extract rules from numerical data [1].
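To make the granulation step concrete, the following sketch computes the δ-neighborhood of every sample in a numeric feature space: each granule collects the samples within distance δ. It is a minimal illustration assuming Euclidean distance on normalized features; the function name and toy data are ours, not the paper's.

```python
import numpy as np

def delta_neighborhoods(X, delta):
    """For each sample x_i, return the indices of all samples whose
    Euclidean distance to x_i is at most delta (its delta-neighborhood).
    The family of these neighborhoods granulates the feature space."""
    # Pairwise Euclidean distance matrix (n x n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return [np.flatnonzero(dist[i] <= delta) for i in range(len(X))]

# Toy example: three samples in a 2-D feature subspace, granularity 0.3.
X = np.array([[0.10, 0.20], [0.15, 0.25], [0.90, 0.80]])
granules = delta_neighborhoods(X, 0.3)  # the first two samples share a granule
```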

How to select a proper granularity is a key problem in granular computing [32], [17], [13]. The sizes of granules, the relations between granules, and the operations on granules provide the essential ingredients for developing a theory of granular computing [38]. The size of the neighborhood affects the consistency of neighborhood spaces and their approximation ability: the smaller the neighborhood, the higher the consistency of classification in the neighborhood space. As shown in Fig. 1, a test sample may be misclassified if the granularity is not set correctly for neighborhood rough sets [6] or the KNN classifier [12]. In [8], the impact of neighborhood size on attribute reduction based on neighborhood dependency was discussed. Although in neighborhood covering reduction [1] the neighborhood size of each sample varies with its position in the feature space, the selection of neighborhood size still relies on empirical values and remains an open problem.
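This trade-off can be made tangible with a small purity measure: the fraction of samples whose δ-neighborhood contains only samples of their own class, which falls as δ grows. The purity criterion below is a simple stand-in for the neighborhood dependency studied in [8], again assuming Euclidean distance; names and data are illustrative.

```python
import numpy as np

def neighborhood_consistency(X, y, delta):
    """Fraction of samples whose delta-neighborhood contains only
    samples of their own class. Small delta -> high consistency but
    tiny granules; large delta -> coarse granules and more class mixing."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    pure = sum(np.all(y[dist[i] <= delta] == y[i]) for i in range(len(X)))
    return pure / len(X)

# Sweeping delta exposes the granularity-selection problem on toy data.
rng = np.random.default_rng(0)
X = rng.random((100, 4))
y = (X[:, 0] + X[:, 1] > 1).astype(int)
for delta in (0.05, 0.2, 0.5):
    print(delta, neighborhood_consistency(X, y, delta))
```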

Given a learning task, we may obtain diverse results in different granular spaces; combining these results may therefore improve performance. As illustrated in Fig. 1, we can recognize a person from the global face or from local patches [4], and combining global and local information may greatly improve recognition performance [27], [41]. It is known that there are multiple attribute reducts that preserve the discrimination ability of the original feature space. In different granular spaces, we can obtain a set of attribute reducts carrying complementary information, and the outputs from these granular spaces can be combined.

Boosting [2] and AdaBoost [21] are among the most typical and successful ensemble learning methods. They learn the weights of base classifiers, and the final output is a linear weighted combination of the individual outputs. Schapire [23] explained AdaBoost in terms of the margin distribution and gave a generalization bound. In [35], a bagging pruning technique based on margin distribution optimization was proposed. In [41], an ensemble face recognition method was proposed that combines multi-scale outputs by optimizing the margin distribution.
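For binary labels y_i ∈ {−1, +1}, base outputs h_t(x_i) ∈ {−1, +1}, and classifier weights w_t, the margin of sample i under such a linear combination is y_i Σ_t w_t h_t(x_i), and Schapire's analysis bounds generalization in terms of the distribution of these margins. A minimal sketch of this standard quantity (the function name is ours):

```python
import numpy as np

def ensemble_margins(H, y, w):
    """Margins of a weighted vote of base classifiers.
    H: (n, T) matrix of base predictions in {-1, +1};
    y: (n,) true labels in {-1, +1};
    w: (T,) nonnegative weights summing to 1.
    A positive margin means the combined prediction is correct; the
    larger the margin, the more confident the ensemble on that sample."""
    return y * (H @ w)
```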

In this paper, we propose a technique to select and combine different granularities based on margin distribution optimization. In each neighborhood granular space, we train a corresponding classification model. By optimizing the margin distribution of the final decision function, we derive the weights of the different granularities; the granularity with the largest weight is considered optimal. In addition, the weights can be used to rank the granularities or to combine the recognition results from different granular spaces. Experimental results show that the proposed granularity selection and combination method can significantly improve classification performance.
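As a rough sketch of the idea, the code below learns granularity weights on the probability simplex by penalizing margins that fall short of a target θ, a hinge-style surrogate for pushing the margin distribution upward. The objective, the threshold θ, and the function names are our assumptions for illustration; the paper's actual optimization problem is given in Section 3.

```python
import numpy as np
from scipy.optimize import minimize

def learn_granularity_weights(H, y, theta=0.1):
    """Learn weights over models trained at different granularities.
    H: (n, T) predictions in {-1, +1} of T granularity-specific models;
    y: (n,) labels in {-1, +1}.
    Minimizes the average hinge penalty max(0, theta - margin) over
    weights constrained to the probability simplex."""
    n, T = H.shape

    def loss(w):
        margins = y * (H @ w)
        # Nonsmooth hinge; SLSQP treats it approximately, which is
        # adequate for this illustrative sketch.
        return np.maximum(0.0, theta - margins).mean()

    res = minimize(
        loss,
        np.full(T, 1.0 / T),                      # start from uniform weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * T,                  # nonnegative weights
        constraints=({"type": "eq", "fun": lambda w: w.sum() - 1.0},),
    )
    return res.x

# The granularity with the largest weight is taken as optimal, and the
# combined decision for new samples is sign(H_new @ w).
```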

The structure of this paper is as follows. In Section 2, neighborhood-based granular models are introduced. Section 3 presents the granularity selection and combination method. In Section 4, an experimental analysis is given to show the performance of the proposed method. Finally, conclusions and future work are presented in Section 5.

Section snippets

Neighborhood granular models

In this section, the neighborhood-based granular computing model is introduced. The granularity sensitivity of the neighborhood granular models is discussed in Section 2.3.

Granularity selection and combination

Both neighborhood classifiers and neighborhood attribute reduction are sensitive to the granularity δ, so granularity selection is a non-trivial task. The information carried by different granularities may be different and complementary. Assume that three features {13, 1, 10} of the wine data are selected by neighborhood feature selection. Rules are then learned separately in the feature subspaces {13, 10} and {1, 10}, as shown in Fig. 4. The learned rules are different and they may be complementary.
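One way to realize this setup, sketched here under our own assumptions (scikit-learn's KNeighborsClassifier as the base learner, 0-based feature indices), is to train one base model per feature subspace and stack their test predictions into the matrix H consumed by the weight-learning sketch above:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def subspace_predictions(X_train, y_train, X_test, subspaces):
    """Fit one classifier per feature subspace and return an
    (n_test, T) matrix of predictions, one column per subspace."""
    cols = []
    for fs in subspaces:
        clf = KNeighborsClassifier(n_neighbors=3)
        clf.fit(X_train[:, fs], y_train)
        cols.append(clf.predict(X_test[:, fs]))
    return np.stack(cols, axis=1)

# E.g. the two wine subspaces from the text, converted to 0-based indices:
# H = subspace_predictions(X_tr, y_tr, X_te, [[12, 9], [0, 9]])
```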

Experiment analysis

To show the effectiveness of the proposed method, we first report the granularity weights in Section 4.1. Then, for granularity-sensitive classifiers, we take NEC and KNN as examples to show the superiority of the proposed GSC_MD in Section 4.2. Finally, an experiment on classification based on multi-granularity subspaces is conducted.

Conclusions and future work

Neighborhood rough set is a granularity-sensitive granular computing model. We can train multiple models at different granularities, which leads to diverse granular views of a learning task. As the base classifiers trained in different granular spaces are complementary, in this paper we explore ensemble learning techniques to solve the granularity selection and combination problem. By optimizing the margin distribution, we learn the weights of the different granularities. The weights are then used for granularity selection and combination.

Acknowledgments

This work is partly supported by the National Program on Key Basic Research Project under Grant 2013CB329304, the National Natural Science Foundation of China under Grants 61222210 and 61105054, and the Program for New Century Excellent Talents in University under Grant NCET-12-0399.

References (43)

  • S. Wang et al., Tumor classification by combining PNN classifier ensemble with neighborhood rough set based gene reduction, Computers in Biology and Medicine (2010)
  • W. Wu et al., Theory and applications of granular labelled partitions in multi-scale decision tables, Information Sciences (2011)
  • W. Wu et al., Generalized fuzzy rough sets, Information Sciences (2003)
  • W. Wu et al., Neighborhood operator systems and approximations, Information Sciences (2002)
  • Z. Xie et al., Margin distribution based bagging pruning, Neurocomputing (2012)
  • Y. Yao, Relational interpretations of neighborhood operators and rough set approximation operators, Information Sciences (1998)
  • W. Zhu, Topological approaches to covering rough sets, Information Sciences (2007)
  • W. Zhu et al., Reduction and axiomization of covering generalized rough sets, Information Sciences (2003)
  • Y. Freund, R. Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, in: ...
  • R. Gilad-Bachrach, A. Navot, N. Tishby, Margin based feature selection-theory and algorithms, in: Proceedings of the ...
  • B. Heisele, P. Ho, T. Poggio, Face recognition with support vector machines: global versus component-based approach, ...