Abstract
Ensemble methods such as bagging and boosting are popular because of their strong theoretical performance guarantees and superior empirical results. They are typically designed under the assumption that the training data set is static and fully available at training time, which makes them unsuitable for online and incremental learning. In this paper we propose IBoost, an extension of AdaBoost to incremental learning that minimizes an exponential cost function which evolves over time as the training data changes. The resulting algorithm is flexible and can be customized to the computational constraints of a particular application. We evaluated the new algorithm on stream learning in the presence of concept change. Experimental results show that IBoost achieves better performance than the original AdaBoost retrained from scratch each time the data set changes, and that it also outperforms the previously proposed Online Coordinate Boost, Online Boost and its non-stationary modifications, Fast and Light Boosting, ADWIN Online Bagging, and DWM algorithms.
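As background for the cost function named in the abstract (the paper body is not reproduced here, so the notation below is an assumed sketch rather than the authors' own): AdaBoost fits an additive ensemble $F(x) = \sum_{m=1}^{M} \alpha_m h_m(x)$ by greedily minimizing an exponential loss (see Friedman et al. in the references). A natural reading of the abstract is that IBoost minimizes the same loss over a training set $D_t$ that evolves with time $t$:

$$L_t(F) \;=\; \sum_{(x_i, y_i) \in D_t} \exp\bigl(-y_i\, F(x_i)\bigr),$$

so that adding or removing examples changes $D_t$, and the previously learned ensemble is updated incrementally to reduce the new $L_t$ rather than being rebuilt from scratch.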
References
Kolter, J.Z., Maloof, M.A.: Dynamic weighted majority: A new ensemble method for tracking concept drift. In: Proc. ICDM, pp. 123–130 (2003)
Scholz, M.: Knowledge-Based Sampling for Subgroup Discovery. In: Local Pattern Detection, pp. 171–189. Springer, Heidelberg (2005)
Bifet, A., Holmes, G., Pfahringer, B., Kirkby, R., Gavaldà, R.: New ensemble methods for evolving data streams. In: Proc. ACM SIGKDD (2009)
Wang, H., Fan, W., Yu, P.S., Han, J.: Mining concept-drifting data streams using ensemble classifiers. In: Proc. ACM SIGKDD, pp. 226–235 (2003)
Chu, F., Zaniolo, C.: Fast and light boosting for adaptive mining of data streams. In: Proc. PAKDD, pp. 282–292 (2004)
Oza, N., Russell, S.: Experimental comparisons of online and batch versions of bagging and boosting. In: Proc. ACM SIGKDD (2001)
Pocock, A., Yiapanis, P., Singer, J., Luján, M., Brown, G.: Online Non-Stationary Boosting. In: Intl. Workshop on Multiple Classifier Systems (2010)
Attar, V., Sinha, P., Wankhade, K.: A fast and light classifier for data streams. Evolving Systems 1(4), 199–207 (2010)
Pelossof, R., Jones, M., Vovsha, I., Rudin, C.: Online Coordinate Boosting. In: On-line Learning for Computer Vision Workshop, ICCV (2009)
Friedman, J., Hastie, T., Tibshirani, R.: Additive logistic regression: a statistical view of boosting. The Annals of Statistics 28, 337–407 (2000)
Freund, Y., Schapire, R.E.: Experiments with a new boosting algorithm. In: Machine Learning: Proceedings of the Thirteenth International Conference, pp. 148–156 (1996)
Schapire, R.E., Singer, Y.: Improved Boosting Algorithms Using Confidence-rated Predictions. Machine Learning 37(3), 297–336 (1999)
Schapire, R.E.: The convergence rate of AdaBoost. In: Proc. COLT (2010)
Žliobaitė, I.: Learning under Concept Drift: an Overview. Technical Report, Vilnius University, Faculty of Mathematics and Informatics (2009)
Street, W., Kim, Y.: A streaming ensemble algorithm (SEA) for large-scale classification. In: Proc. ACM SIGKDD, pp. 377–382 (2001)
Weigend, A.S., Mangeas, M., Srivastava, A.N.: Nonlinear gated experts for time series: discovering regimes and avoiding overfitting. International Journal of Neural Systems 6(4), 373–399 (1995)
Bifet, A., Gavaldà, R.: Learning from time changing data with adaptive windowing. In: Proc. SIAM International Conference on Data Mining, pp. 443–448 (2007)
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Grbovic, M., Vucetic, S. (2011). Tracking Concept Change with Incremental Boosting by Minimization of the Evolving Exponential Loss. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science, vol. 6911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23780-5_43
DOI: https://doi.org/10.1007/978-3-642-23780-5_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23779-9
Online ISBN: 978-3-642-23780-5