
A hybrid block-based ensemble framework for the multi-class problem to react to different types of drifts

Cluster Computing

Abstract

Data stream mining is an important research topic that has received increasing attention due to its use in a wide range of applications, such as sensor networks, banking, and telecommunication. The phenomenon of data streams evolving over time is known as concept drift. The presence of multiple classes further aggravates the loss in performance during drift detection in data streams. Several drift detectors and ensemble approaches have been widely employed; however, they either incur a high cost in memory consumption and run time or, in the case of ensemble approaches, may respond slowly because classifiers are trained on outdated blocks. Motivated by this, we propose a hybrid block-based ensemble, a framework for multi-class classification in evolving data streams. The framework aims to integrate the main advantages of an online drift detector for a k-class problem with the concept of block-based weighting, so that it can react to different types of drift. Experimental evaluations on well-known synthetic and real-world datasets, in a comprehensive comparison against eleven drift detectors and five ensemble approaches, show that our proposed algorithms perform significantly better than the other drift detectors and ensemble approaches.
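
The abstract does not specify the algorithm, so the following is only a minimal sketch of the general idea it names: a block-based ensemble whose members are reweighted on each new block, combined with an online error monitor that triggers an immediate rebuild instead of waiting for the next full block. It is not the authors' method. The class names (`CentroidClassifier`, `HybridBlockEnsemble`), the sliding-window error-rate test, the toy nearest-centroid learner, and all parameter values are assumptions made for illustration.

```python
from collections import Counter, defaultdict, deque


class CentroidClassifier:
    """Toy multi-class base learner: nearest class mean on a 1-D feature."""

    def __init__(self, block):
        sums, counts = defaultdict(float), Counter()
        for x, y in block:
            sums[y] += x
            counts[y] += 1
        self.centroids = {y: sums[y] / counts[y] for y in counts}

    def predict(self, x):
        return min(self.centroids, key=lambda y: abs(x - self.centroids[y]))


def accuracy(clf, block):
    return sum(clf.predict(x) == y for x, y in block) / len(block)


class HybridBlockEnsemble:
    """Hypothetical hybrid: block-based weighting plus an online error monitor."""

    def __init__(self, block_size=50, max_members=5, window=20, threshold=0.5):
        self.block_size, self.max_members = block_size, max_members
        self.window, self.threshold = window, threshold
        self.members = []                    # [classifier, weight] pairs
        self.buffer = []                     # examples of the current block
        self.recent = deque(maxlen=window)   # (x, y, was_misclassified)
        self.drifts = 0

    def predict(self, x):
        if not self.members:
            return None
        votes = defaultdict(float)
        for clf, weight in self.members:
            votes[clf.predict(x)] += weight  # weighted majority vote
        return max(votes, key=votes.get)

    def _add_member(self, block):
        # Reweight existing members by accuracy on the newest block,
        # keep the strongest ones, then add a classifier trained on the block.
        for member in self.members:
            member[1] = accuracy(member[0], block)
        self.members.sort(key=lambda m: m[1], reverse=True)
        del self.members[self.max_members - 1:]
        self.members.append([CentroidClassifier(block), 1.0])

    def learn_one(self, x, y):
        pred = self.predict(x)               # test-then-train
        if pred is not None:
            self.recent.append((x, y, pred != y))
        self.buffer.append((x, y))
        errors = sum(wrong for _, _, wrong in self.recent)
        if len(self.recent) == self.window and errors > self.threshold * self.window:
            # Drift signalled: discard outdated members and rebuild at once from
            # the recently misclassified examples (an assumed heuristic).
            self.drifts += 1
            fresh = [(a, b) for a, b, wrong in self.recent if wrong]
            self.members, self.buffer = [], list(fresh)
            self.recent.clear()
            self._add_member(fresh)
        elif len(self.buffer) >= self.block_size:
            self._add_member(self.buffer)
            self.buffer = []
        return pred


def demo():
    """Three-class stream whose labelling is permuted abruptly at t = 200."""
    before = {0.0: "a", 5.0: "b", 10.0: "c"}
    after = {0.0: "c", 5.0: "a", 10.0: "b"}
    model, mistakes = HybridBlockEnsemble(), 0
    for t in range(400):
        x = (0.0, 5.0, 10.0)[t % 3]
        y = (before if t < 200 else after)[x]
        pred = model.learn_one(x, y)
        mistakes += pred is not None and pred != y
    return mistakes, model.drifts, len(model.members)
```

On this noise-free toy stream, `demo()` signals a single drift, 11 examples after the change. A purely block-based ensemble with the same block size would keep voting with outdated members until the next block of 50 closed, which is the slow reaction the abstract attributes to block-based approaches.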



Author information

Corresponding author

Correspondence to Osama A. Mahdi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Mahdi, O.A., Pardede, E. & Ali, N. A hybrid block-based ensemble framework for the multi-class problem to react to different types of drifts. Cluster Comput 24, 2327–2340 (2021). https://doi.org/10.1007/s10586-021-03267-7

  • DOI: https://doi.org/10.1007/s10586-021-03267-7
