Abstract
Ensemble methods combine the outputs of a set of base predictors. Ensemble accuracy depends on two factors: the accuracy of the base classifiers and their diversity (how much the outputs of the base classifiers differ from one another). This paper presents an approach for increasing the diversity of the base classifiers. The method builds new features that are added to the training dataset of each base classifier. These new features are computed using a Nearest Neighbor (NN) classifier built from a few randomly selected instances. The NN classifier returns: (i) an indicator identifying which instance is the nearest neighbor, and (ii) the class this neighbor predicts for the instance. We tested this idea using decision trees as base classifiers. An experimental validation on 62 UCI datasets is provided for traditional ensemble methods, showing that both ensemble accuracy and base classifier diversity are usually improved.
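The feature construction described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes Euclidean distance, a one-hot encoding for the nearest-neighbor indicator, and a one-hot encoding of the predicted class; the function name and the parameter `m` (number of randomly selected instances) are hypothetical.

```python
import numpy as np

def disturbing_neighbors_features(X_train, y_train, X, m=10, rng=None):
    # Sketch of the "disturbing neighbors" idea: m instances are drawn at
    # random from the training set, and a 1-NN classifier over them yields
    # new features for each instance in X. (Names and encoding choices here
    # are assumptions for illustration.)
    rng = np.random.default_rng(rng)
    idx = rng.choice(len(X_train), size=m, replace=False)
    anchors, anchor_labels = X_train[idx], y_train[idx]
    classes = np.unique(y_train)

    # For each instance, find its nearest randomly selected neighbor
    # (Euclidean distance assumed).
    dists = np.linalg.norm(X[:, None, :] - anchors[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)

    # Feature block (i): one-hot indicator of which neighbor is nearest.
    indicator = np.zeros((len(X), m))
    indicator[np.arange(len(X)), nearest] = 1.0

    # Feature block (ii): the class the 1-NN classifier predicts for the
    # instance, encoded as a one-hot over the class labels.
    pred = anchor_labels[nearest]
    pred_onehot = (pred[:, None] == classes[None, :]).astype(float)

    # Augment the original features with the new ones; a base classifier
    # (e.g. a decision tree) is then trained on this extended dataset.
    return np.hstack([X, indicator, pred_onehot])
```

Because each base classifier in the ensemble would draw its own random subset of instances, each one sees a differently perturbed feature space, which is the intended source of extra diversity.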
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
Cite this chapter
Maudes, J., Rodríguez, J.J., García-Osorio, C. (2009). Disturbing Neighbors Diversity for Decision Forests. In: Okun, O., Valentini, G. (eds) Applications of Supervised and Unsupervised Ensemble Methods. Studies in Computational Intelligence, vol 245. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03999-7_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03998-0
Online ISBN: 978-3-642-03999-7
eBook Packages: Engineering (R0)