Abstract
The success of data stream mining techniques has allowed decision makers to analyze their data in multiple domains, ranging from network intrusion monitoring to financial markets analysis and the exploration of online sales transactions. In particular, online ensembles that construct accurate models against drifting data streams have been developed. Recently, there has been a surge of interest in mobile (or so-called pocket) data stream mining, which aims to construct near real-time models for data stream mining applications running on mobile devices. In such a setting, computational resources are limited, and the analytics must therefore be adapted to the resources available. Consequently, the resultant models should not only be highly accurate, but should also adapt swiftly to change. In addition, the data mining techniques should be fast, scalable, and efficient in terms of resource allocation. It then becomes important to consider Return on Investment (ROI) issues such as storage requirements and memory utilization. This paper introduces the Adaptive Ensemble Size (AES) algorithm, an extension of the Online Bagging method, to address these issues. Our AES method dynamically adapts the ensemble size based on ROI usage patterns. We illustrate our approach by analyzing its performance against both synthetic and real-world data streams. The results, when comparing our AES algorithm with the state of the art, indicate that we obtain a high ROI and adapt swiftly to change, without compromising predictive accuracy.
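The abstract describes AES as an extension of Oza and Russell's Online Bagging, in which each base learner is trained on each incoming example k times with k drawn from Poisson(1), and the ensemble size is then adjusted according to a resource/accuracy (ROI) trade-off. The following is a minimal sketch of that idea: the Poisson-weighted Online Bagging loop follows the published scheme, while the adaptation rule in `AdaptiveEnsemble` (error thresholds, window size, and the toy majority-class base learner) is a hypothetical illustration, not the AES algorithm as published.

```python
import math
import random
from collections import Counter

def poisson1():
    """Sample from Poisson(lambda=1) via Knuth's multiplication method."""
    limit = math.exp(-1.0)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

class MajorityLearner:
    """Toy base learner: predicts the majority class it has seen so far."""
    def __init__(self):
        self.counts = Counter()
    def learn(self, x, y):
        self.counts[y] += 1
    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None

class OnlineBagging:
    """Online Bagging (Oza & Russell): each learner sees each example
    k ~ Poisson(1) times, approximating bootstrap sampling online."""
    def __init__(self, n_learners=10, make_learner=MajorityLearner):
        self.make_learner = make_learner
        self.learners = [make_learner() for _ in range(n_learners)]
    def learn(self, x, y):
        for learner in self.learners:
            for _ in range(poisson1()):
                learner.learn(x, y)
    def predict(self, x):
        votes = Counter(l.predict(x) for l in self.learners)
        return votes.most_common(1)[0][0]

class AdaptiveEnsemble(OnlineBagging):
    """Hypothetical AES-style wrapper: grow the ensemble when the running
    prequential error rises (drift suspected) and shrink it when accuracy
    is stable, trading accuracy against memory. Thresholds and the window
    length are illustrative assumptions."""
    def __init__(self, min_size=3, max_size=25, **kw):
        super().__init__(**kw)
        self.min_size, self.max_size = min_size, max_size
        self.errors = []  # sliding window of 0/1 prequential errors
    def learn(self, x, y):
        # Test-then-train: score the current example before learning it.
        self.errors.append(0 if self.predict(x) == y else 1)
        self.errors = self.errors[-100:]
        err = sum(self.errors) / len(self.errors)
        if err > 0.3 and len(self.learners) < self.max_size:
            self.learners.append(self.make_learner())  # grow on suspected drift
        elif err < 0.1 and len(self.learners) > self.min_size:
            self.learners.pop()  # shrink when stable, freeing memory
        super().learn(x, y)
```

The prequential (test-then-train) scoring used here mirrors the standard evaluation practice for data streams; the grow/shrink rule stands in for the ROI-driven adaptation that the paper develops.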
Cite this article
Olorunnimbe, M.K., Viktor, H.L. & Paquet, E. Dynamic adaptation of online ensembles for drifting data streams. J Intell Inf Syst 50, 291–313 (2018). https://doi.org/10.1007/s10844-017-0460-9