Abstract
Ensemble methods such as Bagging and Boosting, which combine the decisions of multiple hypotheses, are among the strongest existing machine learning methods. The diversity of an ensemble's members is known to be an important factor in determining its generalization error. We present a new method for generating ensembles, named CDEBMTE (Creation of Diverse Ensemble Based on Manipulation of Training Examples), that directly constructs diverse hypotheses by manipulating the training examples in three ways: (1) sub-sampling training examples, (2) decreasing/increasing error-prone training examples, and (3) decreasing/increasing neighbor samples of error-prone training examples. Experimental results using two well-known classifiers as base learners demonstrate that this approach consistently achieves higher predictive accuracy than the base classifier, AdaBoost, and Bagging. CDEBMTE also outperforms AdaBoost more prominently as the training set becomes larger.
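The abstract gives only a high-level outline of the three manipulations, so the following Python sketch is one plausible reading rather than the authors' algorithm. The function and parameter names (build_cdebmte_ensemble, n_members, subsample) are assumptions, and the use of a pilot model to locate error-prone examples is likewise inferred from the abstract.

```python
# A minimal sketch of the three training-example manipulations named in the
# abstract. All names here are illustrative assumptions, not the authors'
# implementation.
import numpy as np
from sklearn.base import clone
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier


def build_cdebmte_ensemble(X, y, base_learner=None, n_members=9,
                           subsample=0.7, n_neighbors=5, seed=0):
    """Train ensemble members on three kinds of manipulated training sets."""
    base_learner = base_learner or DecisionTreeClassifier()
    rng = np.random.default_rng(seed)
    n = len(X)
    members = []
    for i in range(n_members):
        strategy = i % 3
        if strategy == 0:
            # (1) sub-sample the training examples
            idx = rng.choice(n, size=int(subsample * n), replace=False)
        else:
            # Locate error-prone examples with a pilot model (an assumption;
            # the paper may identify them differently).
            pilot = clone(base_learner).fit(X, y)
            wrong = np.flatnonzero(pilot.predict(X) != y)
            if wrong.size == 0:
                idx = np.arange(n)
            elif strategy == 1:
                # (2) increase the presence of error-prone examples by
                # duplicating them (decreasing would instead drop them)
                idx = np.concatenate([np.arange(n), wrong])
            else:
                # (3) increase the presence of neighbors of error-prone
                # examples, found via k-nearest neighbors
                nn = NearestNeighbors(n_neighbors=n_neighbors).fit(X)
                _, nbrs = nn.kneighbors(X[wrong])
                idx = np.concatenate([np.arange(n), nbrs.ravel()])
        members.append(clone(base_learner).fit(X[idx], y[idx]))
    return members


def predict_majority(members, X):
    """Plurality vote over members; assumes integer-coded class labels."""
    votes = np.stack([m.predict(X) for m in members]).astype(int)
    return np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, votes)
```

Under these assumptions, members = build_cdebmte_ensemble(X_train, y_train) followed by predict_majority(members, X_test) combines the diversified members by plurality vote.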
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Parvin, H., Parvin, S., Rezaei, Z., Mohamadi, M. (2012). CDEBMTE: Creation of Diverse Ensemble Based on Manipulation of Training Examples. In: Omatu, S., De Paz Santana, J., González, S., Molina, J., Bernardos, A., Rodríguez, J. (eds) Distributed Computing and Artificial Intelligence. Advances in Intelligent and Soft Computing, vol 151. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28765-7_15
DOI: https://doi.org/10.1007/978-3-642-28765-7_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28764-0
Online ISBN: 978-3-642-28765-7
eBook Packages: Engineering (R0)