Abstract
AdaBoost [4] is a well-known ensemble learning algorithm that constructs its base models in sequence. To create each base model, AdaBoost constructs a distribution over the training examples, represented as a vector, with the goal of making the next base model's mistakes uncorrelated with those of the previous base model [5]. In previous work [7], we developed an algorithm, AveBoost, that first constructed a distribution in the same way as AdaBoost and then averaged it with the previous models' distributions to create the next base model's distribution. Our experiments demonstrated the superior accuracy of this approach. In this paper, we slightly revise our algorithm to obtain non-trivial theoretical results: bounds on the training error and on the generalization error (the difference between training and test error). Our averaging process has a regularizing effect, which leads to a worse training error bound for our algorithm than for AdaBoost but a better generalization error bound. This leads us to suspect that our new algorithm works better than AdaBoost on noisy data. For this paper, we experimented with the data that we used in [7], both as originally supplied and with added label noise, in which some examples have their original labels changed at random. Our algorithm's experimental performance improvement over AdaBoost is even greater on the noisy data than on the original data.
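To make the averaging idea described above concrete, the following is a minimal Python sketch, not the paper's exact AveBoost2 algorithm: the function name aveboost_sketch, the base_learner(X, y, w) interface, the {-1, +1} label convention, and the particular running-average recurrence c <- ((t+1)*c + d)/(t+2) are illustrative assumptions, and the paper's precise update and stopping conditions (which carry the theoretical bounds) may differ.

import numpy as np

def aveboost_sketch(X, y, base_learner, T):
    """Sketch of boosting with averaged weight vectors.

    base_learner(X, y, w) is assumed to return a classifier fitted under
    example weights w, exposing a .predict(X) method; labels y are in {-1, +1}.
    """
    y = np.asarray(y)
    n = len(y)
    c = np.full(n, 1.0 / n)          # averaged distribution used to train model t
    models, alphas = [], []
    for t in range(T):
        h = base_learner(X, y, c)
        pred = np.asarray(h.predict(X))
        err = float(np.dot(c, pred != y))
        if err >= 0.5 or err == 0.0:  # stop if the weak learner is no better than chance, or perfect
            break
        alpha = 0.5 * np.log((1.0 - err) / err)
        # AdaBoost-style reweighting: up-weight the examples model t got wrong
        d = c * np.exp(-alpha * y * pred)
        d /= d.sum()
        # Averaging step: blend the new distribution with those of the previous rounds
        c = ((t + 1) * c + d) / (t + 2)
        models.append(h)
        alphas.append(alpha)

    def predict(X_new):
        # Weighted majority vote of the base models, as in AdaBoost
        votes = sum(a * m.predict(X_new) for a, m in zip(alphas, models))
        return np.sign(votes)

    return predict

With scikit-learn available, base_learner could be, for example, lambda X, y, w: DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w), giving a decision-stump weak learner.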
References
[1] Blake, C., Keogh, E., Merz, C.J.: UCI repository of machine learning databases (1999), http://www.ics.uci.edu/~mlearn/MLRepository.html
[2] Breiman, L.: Bagging predictors. Machine Learning 24(2), 123–140 (1996)
[3] Dietterich, T.G.: An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning 40, 139–158 (2000)
[4] Freund, Y., Schapire, R.: Experiments with a new boosting algorithm. In: Proceedings of the Thirteenth International Conference on Machine Learning, Bari, Italy, pp. 148–156. Morgan Kaufmann, San Francisco (1996)
[5] Kivinen, J., Warmuth, M.K.: Boosting as entropy projection. In: Proceedings of the Twelfth Annual Conference on Computational Learning Theory, pp. 134–144 (1999)
[6] Kutin, S., Niyogi, P.: The interaction of stability and weakness in AdaBoost. Technical Report TR-2001-30, University of Chicago (October 2001)
[7] Oza, N.C.: Boosting with averaged weight vectors. In: Windeatt, T., Roli, F. (eds.) Proceedings of the Fourth International Workshop on Multiple Classifier Systems, pp. 15–24. Springer, Berlin (2003)
[8] Rätsch, G., Onoda, T., Müller, K.R.: Soft margins for AdaBoost. Machine Learning 42, 287–320 (2001)