ABSTRACT
Bagging and boosting are well-known ensemble learning methods. They combine multiple learned base models with the aim of improving generalization performance. To date, they have been used primarily in batch mode, i.e., they require multiple passes through the training data. In previous work, we presented online bagging and boosting algorithms that require only one pass through the training data, along with experimental results on some relatively small datasets. Through additional experiments on a variety of larger synthetic and real datasets, this paper demonstrates that our online versions perform comparably to their batch counterparts in terms of classification accuracy. We also demonstrate that the online algorithms achieve substantial reductions in running time because they require fewer passes through the training data.
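As background for the single-pass claim: in the online algorithms of [7], each arriving example is presented to each base model k times, where k is drawn from a Poisson(1) distribution for bagging (online boosting follows the same pattern but adjusts the Poisson parameter according to the example's current weight). The sketch below is a minimal illustration of this online bagging scheme under stated assumptions, not the authors' implementation; the `OnlineBagger` class is hypothetical, and scikit-learn's `GaussianNB` stands in as a base learner only because it supports incremental updates via `partial_fit` (the paper's experiments use other base models, including online decision trees [8]).

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

class OnlineBagger:
    """Minimal sketch of online bagging in the style of Oza & Russell [7]:
    each arriving example is shown to each base model k times, with
    k ~ Poisson(1), which approximates bootstrap resampling as the
    number of examples grows."""

    def __init__(self, make_base_model, n_models=10, seed=0):
        self.models = [make_base_model() for _ in range(n_models)]
        self.rng = np.random.default_rng(seed)

    def update(self, x, y, classes):
        # Single pass: each example is processed once and never stored.
        for model in self.models:
            for _ in range(self.rng.poisson(1.0)):
                model.partial_fit([x], [y], classes=classes)

    def predict(self, x):
        # Combine the base models by unweighted majority vote.
        votes = [m.predict([x])[0] for m in self.models]
        values, counts = np.unique(votes, return_counts=True)
        return values[np.argmax(counts)]

# Hypothetical usage on a stream of (features, label) pairs:
# bagger = OnlineBagger(GaussianNB, n_models=10)
# for x, y in stream:
#     bagger.update(x, y, classes=[0, 1])
```

Because each example is discarded after the update, memory cost is independent of the stream length, which is the source of the running-time and storage advantage reported above.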
REFERENCES
- 1. S. D. Bay. The UCI KDD archive, 1999. (URL: http://kdd.ics.uci.edu).
- 2. C. Blake, E. Keogh, and C. J. Merz. UCI repository of machine learning databases, 1999. (URL: http://www.ics.uci.edu/~mlearn/MLRepository.html).
- 3. L. Breiman. Bagging predictors. Machine Learning, 24(2):123-140, 1996.
- 4. Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119-139, 1997.
- 5. O. L. Mangasarian, R. Setiono, and W. H. Wolberg. Pattern recognition via linear programming: Theory and application to medical diagnosis. In Thomas F. Coleman and Yuying Li, editors, Large-Scale Numerical Optimization, pages 22-30. SIAM Publications, 1990.
- 6. N. C. Oza and S. Russell. Experimental comparisons of online and batch versions of bagging and boosting. Technical report, Electrical Engineering and Computer Science Department, University of California, Berkeley, CA. In preparation.
- 7. N. C. Oza and S. Russell. Online bagging and boosting. In Artificial Intelligence and Statistics 2001, pages 105-112. Morgan Kaufmann, 2001.
- 8. P. E. Utgoff, N. C. Berkman, and J. A. Clouse. Decision tree induction based on efficient tree restructuring. Machine Learning, 29(1):5-44, 1997.