
Pattern Recognition

Volume 39, Issue 3, March 2006, Pages 488-490

Rapid and brief communication
FuzzyBagging: A novel ensemble of classifiers

https://doi.org/10.1016/j.patcog.2005.10.002

Abstract

In this work, a new method for the creation of classifier ensembles is introduced. The patterns are partitioned into clusters so that similar patterns are grouped together, and a training set is built from the patterns that belong to each cluster. Each of the new sets is used to train a classifier. We show that the approach presented here, called FuzzyBagging, obtains better performance than Bagging.

Introduction

Multiclassifier systems are special cases of approaches that integrate several data-driven models for the same problem. A key goal is to obtain a better composite global model, with more accurate and reliable estimates. The combination methods proposed in the literature are based on “voting” rules, statistical techniques, belief functions, and other “classifier fusion” schemes. As an example, the “majority” voting rule interprets each classification result as a “vote” for one of the data classes and assigns the input pattern to the class receiving the majority of votes. Such methods assume that, for each pattern, different classifiers make different classification errors. Recently, a number of classifier combination methods, called ensemble methods, have been proposed in the field of machine learning [1]. Given a single classifier, called the base classifier, a set of classifiers can be automatically generated. In this paper, we propose a new ensemble creation method that is based on Bagging [1] and Fuzzy C-Means clustering [2]. Bagging, given a training set S of size n, generates K new training sets S1, …, SK, each of size n, by randomly drawing elements of the original training set with replacement, so the same element may be drawn multiple times. In this paper, the new training sets S1, …, SK are instead generated by dividing the training set S into K overlapping clusters; the training set Si is built using the patterns that belong to the ith cluster. We train a classifier on each training set Si, classify each pattern of the test set with these classifiers, and combine their outputs with the “max rule”.
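
For reference, the bootstrap step of standard Bagging can be sketched as follows. This is a minimal illustration, not the authors' code: the decision-tree base classifier and the function name are our own assumptions, since the paper does not fix them here.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def bagging_train(X, y, K, rng=None):
        # Train K base classifiers, each on a size-n sample drawn with
        # replacement from the original training set S (bootstrap sampling).
        rng = np.random.default_rng(0) if rng is None else rng
        n = len(X)
        classifiers = []
        for _ in range(K):
            idx = rng.choice(n, size=n, replace=True)  # duplicates allowed
            classifiers.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
        return classifiers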

System

The new training sets S1, …, SK are generated by partitioning the training set S into K clusters; a different training set Si is built using the patterns that belong to the ith cluster. We assign to cluster i the 63.2% of patterns of each class with the highest membership to that cluster.

We have chosen this value since, in “base” Bagging, if the probability of being drawn is uniformly distributed over the training set S, 63.2% is the expected percentage of distinct training elements contained in each modified training set: a bootstrap sample of size n contains on average a fraction 1 - (1 - 1/n)^n ≈ 1 - 1/e ≈ 0.632 of the original elements.
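
The construction can be sketched as follows, under our reading of this section: a plain Fuzzy C-Means pass yields a membership matrix, and each training set Si keeps, for every class, the 63.2% of patterns with the highest membership to cluster i. The Euclidean distance, the fuzzifier m = 2, and all function names are illustrative assumptions, not details fixed by the paper.

    import numpy as np

    def fcm_memberships(X, K, m=2.0, n_iter=100, rng=None):
        # Plain Fuzzy C-Means: returns the (n, K) membership matrix U.
        rng = np.random.default_rng(0) if rng is None else rng
        U = rng.random((len(X), K))
        U /= U.sum(axis=1, keepdims=True)                 # rows sum to 1
        for _ in range(n_iter):
            W = U ** m                                     # fuzzified weights
            centers = (W.T @ X) / W.sum(axis=0)[:, None]   # weighted means
            d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
            inv = d ** (-2.0 / (m - 1.0))
            U = inv / inv.sum(axis=1, keepdims=True)       # standard FCM update
        return U

    def fuzzy_training_sets(X, y, K, frac=0.632):
        # For each cluster i, keep the `frac` patterns of each class with the
        # highest membership to that cluster; subsets overlap across clusters.
        U = fcm_memberships(X, K)
        subsets = []
        for k in range(K):
            keep = []
            for c in np.unique(y):
                cls = np.flatnonzero(y == c)
                order = cls[np.argsort(U[cls, k])[::-1]]   # descending membership
                keep.extend(order[:max(1, int(frac * len(cls)))])
            subsets.append(np.array(keep))
        return subsets  # index sets for S1, ..., SK

Selecting per class (rather than per cluster as a whole) keeps every class represented in every subset, which also lets the base classifiers be combined class-by-class later.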

Experimental results

We conducted our experiments on 4 datasets of varying complexity, 3 coming from the UCI Repository of Machine Learning Databases [3] and the remaining one (NIST4) from [4]. A summary of the characteristics of these datasets (number of attributes, number of examples, number of classes) is reported in Table 1. For all the datasets but NIST4, the same testing protocol has been adopted: we averaged the results over 10 tests, each time randomly resampling the training and test sets (each containing half of the patterns).
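
A sketch of this protocol, combining the base classifiers with the “max rule” described in the Introduction and reusing fuzzy_training_sets from the sketch above; the decision-tree base classifier, K = 10 clusters, and stratified splitting are our assumptions for illustration only.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    def max_rule_predict(classifiers, X):
        # Max rule: for each test pattern take, per class, the maximum
        # posterior over all base classifiers, then assign the class with
        # the largest value. Assumes every classifier saw every class, so
        # the predict_proba columns line up (the per-class selection in
        # fuzzy_training_sets guarantees this).
        probs = np.stack([clf.predict_proba(X) for clf in classifiers])  # (K, n, C)
        return classifiers[0].classes_[probs.max(axis=0).argmax(axis=1)]

    def average_error(X, y, K=10, n_runs=10):
        # Average test error over n_runs random half/half resamplings.
        errors = []
        for run in range(n_runs):
            Xtr, Xte, ytr, yte = train_test_split(
                X, y, test_size=0.5, random_state=run, stratify=y)
            subsets = fuzzy_training_sets(Xtr, ytr, K)
            clfs = [DecisionTreeClassifier().fit(Xtr[s], ytr[s]) for s in subsets]
            errors.append(np.mean(max_rule_predict(clfs, Xte) != yte))
        return float(np.mean(errors))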

Conclusions

In this work, we propose to use Fuzzy C-Means to obtain several training sets and then train a classifier on each training set. We combine the outputs of the classifiers using the “max rule”. The experimental results show that our new approach is a successful attempt to obtain an error reduction with respect to the performance of standard Bagging. We plan in the future to evaluate the performance of our FuzzyBagging as a function of the percentage of training elements contained in each modified training set.

Acknowledgements

This work has been supported by Italian PRIN prot. 2004098034 and by European Commission IST-2002-507634 Biosecure NoE projects.

References (5)

  • L. Breiman, Bagging predictors, Mach. Learn. (1996)
  • J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms (1981)
There are more references available in the full text version of this article.
