Abstract
Concept drift is a major challenge in data stream mining (including process mining) because it can seriously degrade the accuracy of a model in online learning problems. A model therefore needs to adapt to changes in the data distribution before it makes new predictions. This paper proposes a novel ensemble method, E-ERICS, which combines multiple Bayesian-optimized ERICS models into a single model and uses a voting mechanism to decide whether each instance of a data stream is a concept drift point. Experimental results on synthetic and classic real-world streaming datasets show that the proposed method detects concept drift, especially sudden drift, with substantially higher precision, recall, and F1-score than the original ERICS models.
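The voting step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact voting rule, so the threshold-based vote below (`min_votes`), the function name `vote_drift`, and the example flags are all assumptions made for illustration; they are not the authors' implementation, and the per-detector flags stand in for the outputs of the individual ERICS models.

```python
from typing import List, Sequence


def vote_drift(flags_per_detector: Sequence[Sequence[bool]], min_votes: int) -> List[bool]:
    """Combine per-instance drift flags from several detectors by voting.

    An instance is marked as a drift point when at least `min_votes`
    detectors flag it. The threshold rule is an assumption, not taken
    from the paper.
    """
    if not flags_per_detector:
        return []
    n = len(flags_per_detector[0])
    if any(len(flags) != n for flags in flags_per_detector):
        raise ValueError("every detector must emit one flag per instance")
    return [
        sum(flags[i] for flags in flags_per_detector) >= min_votes
        for i in range(n)
    ]


# Hypothetical flags from three detectors over six stream instances.
flags = [
    [False, False, True, False, False, True],
    [False, True, True, False, False, False],
    [False, False, True, False, True, False],
]

# Require agreement from at least two of the three detectors.
ensemble = vote_drift(flags, min_votes=2)
```

With a two-of-three threshold, only the instance that a majority of detectors flag is reported as a drift point, which is one plausible way an ensemble could suppress the isolated false alarms of its members.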
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nguyen, KT., Tran, T., Nguyen, AD., Phan, XH., Ha, QT. (2022). Parameter Distribution Ensemble Learning for Sudden Concept Drift Detection. In: Nguyen, N.T., Tran, T.K., Tukayev, U., Hong, TP., Trawiński, B., Szczerbicki, E. (eds) Intelligent Information and Database Systems. ACIIDS 2022. Lecture Notes in Computer Science, vol 13758. Springer, Cham. https://doi.org/10.1007/978-3-031-21967-2_16
DOI: https://doi.org/10.1007/978-3-031-21967-2_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21966-5
Online ISBN: 978-3-031-21967-2
eBook Packages: Computer Science, Computer Science (R0)