Abstract
AdaBoost [4] is a well-known ensemble learning algorithm that constructs its base models in sequence. To create each base model, AdaBoost constructs a distribution over the training examples, represented as a vector, with the goal of making the next base model's mistakes uncorrelated with those of the previous base model [5]. In previous work [7], we developed an algorithm, AveBoost, that first constructed a distribution in the same way as AdaBoost and then averaged it with the previous models' distributions to create the next base model's distribution. Our experiments demonstrated the superior accuracy of this approach. In this paper, we slightly revise our algorithm to obtain non-trivial theoretical results: bounds on the training error and on the generalization error (the difference between training and test error). Our averaging process has a regularizing effect, which leads to a worse training error bound for our algorithm than for AdaBoost but a better generalization error bound. This leads us to suspect that our new algorithm works better than AdaBoost on noisy data. For this paper, we experimented with the data that we used in [7], both as originally supplied and with added label noise, in which some examples have their original labels changed at random. Our algorithm's experimental performance improvement over AdaBoost is even greater on the noisy data than on the original data.
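To make the averaging idea described above concrete, the following is a minimal Python sketch, not the paper's exact AveBoost2 algorithm: the function name aveboost_sketch, the base_learner(X, y, w) interface, the {-1, +1} label convention, and the particular running-average recurrence c <- ((t+1)*c + d)/(t+2) are illustrative assumptions, and the paper's precise update and stopping conditions (which carry the theoretical bounds) may differ.

import numpy as np

def aveboost_sketch(X, y, base_learner, T):
    """Sketch of boosting with averaged weight vectors.

    base_learner(X, y, w) is assumed to return a classifier fitted under
    example weights w, exposing a .predict(X) method; labels y are in {-1, +1}.
    """
    y = np.asarray(y)
    n = len(y)
    c = np.full(n, 1.0 / n)          # averaged distribution used to train model t
    models, alphas = [], []
    for t in range(T):
        h = base_learner(X, y, c)
        pred = np.asarray(h.predict(X))
        err = float(np.dot(c, pred != y))
        if err >= 0.5 or err == 0.0:  # stop if the weak learner is no better than chance, or perfect
            break
        alpha = 0.5 * np.log((1.0 - err) / err)
        # AdaBoost-style reweighting: up-weight the examples model t got wrong
        d = c * np.exp(-alpha * y * pred)
        d /= d.sum()
        # Averaging step: blend the new distribution with those of the previous rounds
        c = ((t + 1) * c + d) / (t + 2)
        models.append(h)
        alphas.append(alpha)

    def predict(X_new):
        # Weighted majority vote of the base models, as in AdaBoost
        votes = sum(a * m.predict(X_new) for a, m in zip(alphas, models))
        return np.sign(votes)

    return predict

With scikit-learn available, base_learner could be, for example, lambda X, y, w: DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w), giving a decision-stump weak learner.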
References
[1] Blake, C., Keogh, E., Merz, C.J.: UCI repository of machine learning databases (1999), http://www.ics.uci.edu/~mlearn/MLRepository.html
[2] Breiman, L.: Bagging predictors. Machine Learning 24(2), 123–140 (1996)
[3] Dietterich, T.G.: An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning 40, 139–158 (2000)
[4] Freund, Y., Schapire, R.: Experiments with a new boosting algorithm. In: Proceedings of the Thirteenth International Conference on Machine Learning, Bari, Italy, pp. 148–156. Morgan Kaufmann, San Francisco (1996)
[5] Kivinen, J., Warmuth, M.K.: Boosting as entropy projection. In: Proceedings of the Twelfth Annual Conference on Computational Learning Theory, pp. 134–144 (1999)
[6] Kutin, S., Niyogi, P.: The interaction of stability and weakness in AdaBoost. Technical Report TR-2001-30, University of Chicago (October 2001)
[7] Oza, N.C.: Boosting with averaged weight vectors. In: Windeatt, T., Roli, F. (eds.) Proceedings of the Fourth International Workshop on Multiple Classifier Systems, pp. 15–24. Springer, Berlin (2003)
[8] Rätsch, G., Onoda, T., Müller, K.R.: Soft margins for AdaBoost. Machine Learning 42, 287–320 (2001)