Improvements to Boosting with Data Streams

  • Conference paper
Advances in Artificial Intelligence (Canadian AI 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7884)

Abstract

Data streams (DS) pose a challenge for any machine learning algorithm because of their high volume of data, on the order of millions of instances for a typical data set. Various algorithms have been proposed to handle them. In particular, OzaBoost, a parallel adaptation of AdaBoost, creates several “weak” learners in parallel and updates each of them with new instances during training. At any moment, OzaBoost can stop and output the final model. However, OzaBoost suffers from high memory consumption, which prevents its use for certain types of problems. This work introduces OzaBoost Dynamic, which changes the weight calculation and the number of boosted “weak” learners used by OzaBoost in order to reduce its memory consumption. Empirical results compare the performance of all algorithms on data sets with 50 and 60 million instances.
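
To make the baseline concrete, the following is a minimal sketch of the online boosting scheme of Oza and Russell on which OzaBoost is based, as it is commonly described. It is not OzaBoost Dynamic, whose modified weight calculation is not specified in the abstract. The weak learner `MajorityByFeature`, the class name `OnlineBooster`, the cap `MAX_LAMBDA`, and the toy stream are all assumptions introduced here for illustration.

```python
import math
import random
from collections import defaultdict

MAX_LAMBDA = 20.0  # assumption: cap that keeps exp(-lambda) well above underflow


def poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class MajorityByFeature:
    """Hypothetical weak learner: majority label per value of one feature."""

    def __init__(self, feature):
        self.feature = feature
        self.counts = defaultdict(lambda: defaultdict(int))

    def learn(self, x, y):
        self.counts[x[self.feature]][y] += 1

    def predict(self, x):
        seen = self.counts.get(x[self.feature])
        return max(seen, key=seen.get) if seen else None


class OnlineBooster:
    """Sketch of online boosting: every learner sees every instance once."""

    def __init__(self, learners, seed=0):
        self.learners = learners
        self.correct = [0.0] * len(learners)  # weight seen while correct
        self.wrong = [0.0] * len(learners)    # weight seen while wrong
        self.rng = random.Random(seed)

    def learn(self, x, y):
        lam = 1.0  # instance weight, re-scaled after each learner
        for m, h in enumerate(self.learners):
            for _ in range(poisson(min(lam, MAX_LAMBDA), self.rng)):
                h.learn(x, y)
            if h.predict(x) == y:
                self.correct[m] += lam
                eps = self.wrong[m] / (self.correct[m] + self.wrong[m])
                lam *= 1.0 / (2.0 * (1.0 - eps))  # shrink weight of easy instance
            else:
                self.wrong[m] += lam
                eps = self.wrong[m] / (self.correct[m] + self.wrong[m])
                lam *= 1.0 / (2.0 * eps)          # grow weight of hard instance

    def predict(self, x):
        votes = defaultdict(float)
        for m, h in enumerate(self.learners):
            total = self.correct[m] + self.wrong[m]
            label = h.predict(x)
            if total == 0 or label is None:
                continue
            eps = max(self.wrong[m] / total, 1e-9)  # assumption: clamp to avoid log(1/0)
            if eps >= 0.5:
                continue  # learner no better than chance gets no vote
            votes[label] += math.log((1.0 - eps) / eps)
        return max(votes, key=votes.get) if votes else None


# Toy stream: the label equals feature 0; feature 1 is uninformative.
booster = OnlineBooster(
    [MajorityByFeature(0), MajorityByFeature(1), MajorityByFeature(0)]
)
for i in range(200):
    x = (i % 2, (i // 2) % 2)
    booster.learn(x, x[0])
print(booster.predict((0, 1)), booster.predict((1, 0)))
```

Because `predict` can be called after any number of `learn` calls, the sketch reflects the property that the ensemble can stop and output a model at any moment. Each learner keeps its own state for the whole stream, which hints at why memory grows with the number of boosted learners.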


Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

de Souza, E.N., Matwin, S. (2013). Improvements to Boosting with Data Streams. In: Zaïane, O.R., Zilles, S. (eds) Advances in Artificial Intelligence. Canadian AI 2013. Lecture Notes in Computer Science, vol 7884. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38457-8_23

  • DOI: https://doi.org/10.1007/978-3-642-38457-8_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38456-1

  • Online ISBN: 978-3-642-38457-8

  • eBook Packages: Computer Science, Computer Science (R0)
