Abstract
Data streams (DS) pose a challenge for any machine learning algorithm because of the high volume of data, on the order of millions of instances for a typical data set. Various algorithms have been proposed. In particular, OzaBoost, a parallel adaptation of AdaBoost, creates several “weak” learners in parallel and updates each of them with new instances during training. At any moment, OzaBoost can stop and output the final model. However, OzaBoost suffers from high memory consumption, which prevents its use for certain types of problems. This work introduces OzaBoost Dynamic, which changes the weight calculation and the number of boosted “weak” learners used by OzaBoost in order to reduce its memory consumption. It presents empirical results comparing the performance of all algorithms on data sets with 50 and 60 million instances.
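To make the update scheme concrete, here is a minimal sketch of online boosting in the style usually attributed to Oza and Russell. The abstract does not spell out the algorithm, so this is an illustration under stated assumptions rather than the authors' implementation. The names `CountingNaiveBayes`, `OnlineBoost`, and `poisson` are hypothetical, the weak learner is a toy naive Bayes over binary features, and the OzaBoost Dynamic changes to the weight calculation and ensemble size are not shown.

```python
import math
import random
from collections import defaultdict


class CountingNaiveBayes:
    """Toy incremental naive Bayes over binary features (assumed weak learner)."""

    def __init__(self):
        self.class_counts = defaultdict(int)
        self.feature_counts = defaultdict(int)  # (label, index, value) -> count

    def learn(self, x, y):
        self.class_counts[y] += 1
        for i, v in enumerate(x):
            self.feature_counts[(y, i, v)] += 1

    def predict(self, x):
        if not self.class_counts:
            return None  # nothing seen yet
        total = sum(self.class_counts.values())

        def score(label):
            n = self.class_counts[label]
            s = math.log(n / total)
            for i, v in enumerate(x):
                # Laplace smoothing; the "+ 2" assumes two values per feature.
                s += math.log((self.feature_counts.get((label, i, v), 0) + 1) / (n + 2))
            return s

        return max(self.class_counts, key=score)


def poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's method; the rate is capped for safety."""
    threshold = math.exp(-min(lam, 50.0))
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


class OnlineBoost:
    """Sketch of online boosting: every learner is updated on every instance."""

    def __init__(self, n_learners=3, seed=0):
        self.learners = [CountingNaiveBayes() for _ in range(n_learners)]
        self.lam_correct = [0.0] * n_learners  # weight seen while correct
        self.lam_wrong = [0.0] * n_learners    # weight seen while wrong
        self.rng = random.Random(seed)

    def learn(self, x, y):
        lam = 1.0  # instance weight, passed down the chain of learners
        for m, h in enumerate(self.learners):
            for _ in range(poisson(lam, self.rng)):
                h.learn(x, y)
            if h.predict(x) == y:
                self.lam_correct[m] += lam
                eps = self.lam_wrong[m] / (self.lam_correct[m] + self.lam_wrong[m])
                lam *= 1.0 / (2.0 * (1.0 - eps))  # shrink weight after a hit
            else:
                self.lam_wrong[m] += lam
                eps = self.lam_wrong[m] / (self.lam_correct[m] + self.lam_wrong[m])
                lam *= 1.0 / (2.0 * eps)          # grow weight after a miss

    def predict(self, x):
        votes = defaultdict(float)
        for m, h in enumerate(self.learners):
            total = self.lam_correct[m] + self.lam_wrong[m]
            guess = h.predict(x)
            if total == 0 or guess is None:
                continue
            eps = max(self.lam_wrong[m] / total, 1e-10)
            if eps >= 0.5:
                continue  # simplification: ignore learners no better than chance
            votes[guess] += math.log((1.0 - eps) / eps)
        return max(votes, key=votes.get) if votes else None


# Usage: a synthetic stream in which the label equals the first feature.
stream_rng = random.Random(1)
model = OnlineBoost(n_learners=3, seed=0)
for _ in range(500):
    x = (stream_rng.randint(0, 1), stream_rng.randint(0, 1))
    model.learn(x, x[0])
print(model.predict((1, 0)), model.predict((0, 1)))
```

Because each instance is consumed once and then discarded, training can stop at any point and `predict` can be called on the current ensemble. Memory use grows with the number of learners kept in `self.learners`, which is the cost the abstract says OzaBoost Dynamic aims to reduce.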
References
Bifet, A., Holmes, G., Kirkby, R., Pfahringer, B.: Data stream mining: a practical approach. Technical report, University of Waikato (May 2011)
de Souza, É.N., Matwin, S.: Extending AdaBoost to iteratively vary its base classifiers. In: Butz, C., Lingras, P. (eds.) Canadian AI 2011. LNCS (LNAI), vol. 6657, pp. 384–389. Springer, Heidelberg (2011)
de Souza, E.N., Matwin, S.: Improvements to AdaBoost Dynamic. In: Kosseim, L., Inkpen, D. (eds.) Canadian AI 2012. LNCS (LNAI), vol. 7310, pp. 293–298. Springer, Heidelberg (2012)
Domingos, P., Hulten, G.: Mining high-speed data streams. In: KDD, pp. 71–80 (2000)
Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55(1), 119–139 (1997)
Oza, N.C., Russell, S.: Online bagging and boosting. In: Jaakkola, T., Richardson, T. (eds.) 8th International Workshop on Artificial Intelligence and Statistics, Key West, Florida, USA, pp. 105–112. M. Kaufmann (2001)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
de Souza, E.N., Matwin, S. (2013). Improvements to Boosting with Data Streams. In: Zaïane, O.R., Zilles, S. (eds) Advances in Artificial Intelligence. Canadian AI 2013. Lecture Notes in Computer Science, vol 7884. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38457-8_23
DOI: https://doi.org/10.1007/978-3-642-38457-8_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38456-1
Online ISBN: 978-3-642-38457-8
eBook Packages: Computer Science, Computer Science (R0)