A Framework for Classification in Data Streams Using Multi-strategy Learning

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9956)

Abstract

Adaptive online learning algorithms have been successfully applied to fast-evolving data streams. Such streams are susceptible to concept drift, which implies that the most suitable type of classifier often changes over time. In this setting, a system that is able to seamlessly select the type of learner that presents the current “best” model holds much value. For example, in a scenario such as user profiling for security applications, model adaptation is of the utmost importance. We have implemented a multi-strategy framework, the so-called Tornado environment, which is able to run multiple and diverse classifiers simultaneously for decision making. In our framework, the current learner with the highest performance, at a specific point in time, is selected and the corresponding model is then provided to the user. In our implementation, we employ an Error-Memory-Runtime (EMR) measure which combines the error-rate, the memory usage and the runtime of classifiers as a performance indicator. We conducted experiments on synthetic and real-world datasets with the Hoeffding Tree, Naive Bayes, Perceptron, K-Nearest Neighbours and Decision Stumps algorithms. Our results indicate that our environment is able to adapt to changes and to continuously select the best current type of classifier, as the data evolve.
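The selection step described in the abstract can be illustrated with a small sketch. The abstract does not give the EMR formula, so the scoring below is an assumption: a weighted average of min-max normalised error-rate, memory usage and runtime, where a lower score is better. The names `LearnerStats`, `emr_scores` and `select_best`, the weights, and the sample figures are all hypothetical and are not taken from the Tornado implementation.

```python
from dataclasses import dataclass


@dataclass
class LearnerStats:
    """Measurements for one classifier over the most recent evaluation window."""
    name: str
    error_rate: float  # fraction of misclassified instances
    memory_kb: float   # memory footprint of the model
    runtime_ms: float  # time spent on training and testing


def _normalise(values):
    """Min-max scale values to [0, 1]; all zeros if every value is equal."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def emr_scores(stats, weights=(1.0, 1.0, 1.0)):
    """Assumed EMR-style score per learner: weighted average of the
    normalised error, memory and runtime (lower is better)."""
    w_err, w_mem, w_run = weights
    total = w_err + w_mem + w_run
    errs = _normalise([s.error_rate for s in stats])
    mems = _normalise([s.memory_kb for s in stats])
    runs = _normalise([s.runtime_ms for s in stats])
    return {
        s.name: (w_err * e + w_mem * m + w_run * r) / total
        for s, e, m, r in zip(stats, errs, mems, runs)
    }


def select_best(stats, weights=(1.0, 1.0, 1.0)):
    """Return the name of the learner with the lowest assumed EMR score."""
    scores = emr_scores(stats, weights)
    return min(scores, key=scores.get)


# Made-up measurements for three of the learner types named in the abstract.
window_stats = [
    LearnerStats("Naive Bayes", error_rate=0.20, memory_kb=10.0, runtime_ms=5.0),
    LearnerStats("Hoeffding Tree", error_rate=0.10, memory_kb=100.0, runtime_ms=20.0),
    LearnerStats("K-Nearest Neighbours", error_rate=0.15, memory_kb=500.0, runtime_ms=200.0),
]

best_balanced = select_best(window_stats)                    # equal weights
best_cheap = select_best(window_stats, weights=(0.0, 1.0, 1.0))  # ignore error
```

In a streaming setting, a call such as `select_best` would be repeated at each evaluation point, so the recommended learner can change as the measurements drift. Changing the weights shifts the trade-off between accuracy and resource cost.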

Author information

Corresponding authors

Correspondence to Ali Pesaranghader or Eric Paquet.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Pesaranghader, A., Viktor, H.L., Paquet, E. (2016). A Framework for Classification in Data Streams Using Multi-strategy Learning. In: Calders, T., Ceci, M., Malerba, D. (eds) Discovery Science. DS 2016. Lecture Notes in Computer Science, vol 9956. Springer, Cham. https://doi.org/10.1007/978-3-319-46307-0_22

  • DOI: https://doi.org/10.1007/978-3-319-46307-0_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46306-3

  • Online ISBN: 978-3-319-46307-0

  • eBook Packages: Computer Science, Computer Science (R0)
