Rapid and brief communication
FuzzyBagging: A novel ensemble of classifiers
Introduction
Multiclassifier systems are special cases of approaches that integrate several data-driven models for the same problem. A key goal is to obtain a better composite global model, with more accurate and reliable estimates. The combination methods proposed in the literature are based on “voting” rules, statistical techniques, belief functions, and other “classifier fusion” schemes. As an example, the “majority” voting rule interprets each classification result as a “vote” for one of the data classes and assigns the input pattern to the class receiving the majority of votes. Such methods assume that, for each pattern, different classifiers make different classification errors. Recently, a number of classifier combination methods, called ensemble methods, have been proposed in the field of machine learning [1]. Given a single classifier, called the base classifier, a set of classifiers can be automatically generated. In this paper, we propose a new ensemble creation method based on Bagging [1] and Fuzzy C-Means clustering [2]. Bagging, given a training set S of size n, generates K new training sets S1, …, SK, each of size n, by randomly drawing elements of the original training set with replacement, so the same element may be drawn multiple times. In our method, the new training sets are instead generated by dividing the training set S into K overlapping clusters; the training set Si is built from the patterns that belong to the ith cluster. We train a classifier on each training set Si, classify each pattern of the test set using these classifiers, and combine their outputs with the “max rule”.
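As a rough sketch (in Python, with illustrative function names that are not from the paper), standard Bagging's bootstrap resampling and the max-rule fusion can be written as:

```python
import random

def bootstrap_sets(train, K, seed=0):
    """Standard Bagging: draw K training sets of size n = len(train),
    sampling with replacement, so an element may appear several times."""
    rng = random.Random(seed)
    n = len(train)
    return [[train[rng.randrange(n)] for _ in range(n)] for _ in range(K)]

def max_rule(posteriors):
    """Max-rule fusion: given one posterior vector per classifier,
    pick the class whose maximum posterior over all classifiers is largest."""
    n_classes = len(posteriors[0])
    scores = [max(p[c] for p in posteriors) for c in range(n_classes)]
    return scores.index(max(scores))

# e.g. two classifiers over two classes: the second classifier's 0.9
# for class 1 beats the first classifier's 0.6 for class 0
print(max_rule([[0.6, 0.4], [0.1, 0.9]]))  # 1
```

FuzzyBagging keeps the same fusion step but replaces the bootstrap draw with a membership-based selection, as described in the System section.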
Section snippets
System
The new training sets are generated by partitioning the training set S into K clusters; a different training set Si is built from the patterns that belong to the ith cluster. We assign to each cluster the 63.2% of the patterns of each class with the highest membership to that cluster.
We chose this value because, in standard Bagging, if the probability of being drawn is uniform over the training set S, 63.2% is the expected percentage of distinct training elements contained in each modified training set: a fixed element appears in a size-n bootstrap sample with probability 1 − (1 − 1/n)^n, which tends to 1 − 1/e ≈ 0.632 as n grows.
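A small sketch of that computation, and of the per-class membership-based selection described above (the function names and the membership-matrix layout are our own illustration, not the paper's code):

```python
def expected_unique_fraction(n):
    """Probability that a fixed element appears at least once in a
    size-n sample drawn with replacement: 1 - (1 - 1/n)^n -> 1 - 1/e."""
    return 1.0 - (1.0 - 1.0 / n) ** n

def cluster_training_set(patterns, labels, memberships, k, frac=0.632):
    """Keep, for cluster k, the top `frac` of the patterns of each class,
    ranked by their fuzzy membership memberships[i][k] (a sketch of the
    selection rule described above)."""
    selected = []
    for c in set(labels):
        idx = [i for i, y in enumerate(labels) if y == c]
        idx.sort(key=lambda i: memberships[i][k], reverse=True)
        selected.extend(idx[:max(1, round(frac * len(idx)))])
    selected.sort()
    return ([patterns[i] for i in selected], [labels[i] for i in selected])
```

Because the Fuzzy C-Means memberships are graded, a pattern can rank in the top 63.2% for more than one cluster, which is what makes the K training sets overlapping rather than disjoint.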
Experimental results
We conducted our experiments on four datasets of varying complexity, three coming from the UCI Repository of Machine Learning Databases [3] and the remaining one (NIST4) from [4]. A summary of the characteristics of these datasets (number of attributes, number of examples, number of classes) is reported in Table 1. For all the databases except NIST4, the same testing protocol was adopted: we averaged the results over 10 tests, each time randomly resampling training and test sets (containing half of the
Conclusions
In this work, we propose to use Fuzzy C-Means to obtain several training sets and then train a classifier on each of them. We combine the outputs of the classifiers using the “max rule”. The experimental results show that our new approach is a successful attempt to obtain an error reduction with respect to the performance of standard Bagging. In the future, we plan to evaluate the performance of our FuzzyBagging as a function of the percentage of training elements contained in each modified
Acknowledgements
This work has been supported by Italian PRIN prot. 2004098034 and by European Commission IST-2002-507634 Biosecure NoE projects.
References (5)
L. Breiman, Bagging predictors, Mach. Learn. (1996)
J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms (1981)
Cited by (31)
Effect of ensemble classifier composition on offline cursive character recognition
2013, Information Processing and Management
Citation Excerpt: The class chosen by most base classifiers is the final verdict of the ensemble classifier in majority voting. There are a number of variants of bagging and aggregation approaches including random forests (Breiman, 2001), ordered aggregation (Munoz, Lobato, & Suarez, 2009), and fuzzy bagging (Nanni & Lumini, 2006). In boosting (Schapire, 1990), the base classifiers are also trained on subsets of the training data that are created by re-sampling the training examples.
Reduced reward-punishment editing for building ensembles of classifiers
2011, Expert Systems with Applications
Citation Excerpt: pattern perturbation: each new training set is built changing the patterns that belong to the training set. Some examples of this class are: Bagging (Breiman, 1996), the new training sets S1, … , SK are subsets of the original one; Arcing (Bologna & Appel, 2002), the patterns contained in each new training set are selected according to the probabilities calculated considering the number of times that a given training pattern is misclassified by the previous K − 1 classifiers of the multi-classifier system; Class Switching (Martínez-Muñoz & Suárez, 2005), the K training sets are created randomly changing the classes of a subset of the training examples; Decorate (Melville & Mooney, 2005), the K training sets are created by adding artificial patterns whose labels disagree with the current decision of the multi-classifier; Boosting (Freund & Schapire, 1997), a given weight is assigned to each training pattern, the weights are increased at each iteration for the patterns that are difficult to classify (the ith classifier is built considering the patterns that have been difficult to classify for the previous (i − 1) classifiers of the ensemble); in Nanni and Lumini (2006) the new training sets S1, … , SK are obtained considering different clusterizations of the training patterns. Perturbation of the features: each new training set is built changing the feature set.
Creating ensembles of classifiers via fuzzy clustering and deflection
2010, Fuzzy Sets and Systems
Input Decimated Ensemble based on Neighborhood Preserving Embedding for spectrogram classification
2009, Expert Systems with Applications
Ensemble generation and feature selection for the identification of students with learning disabilities
2009, Expert Systems with Applications
Relationship between data size, accuracy, diversity and clusters in neural network ensembles
2013, International Journal of Computational Intelligence and Applications