Abstract
Data stream mining is an important research topic that has received increasing attention because of its use in a wide range of applications, such as sensor networks, banking, and telecommunication. The phenomenon of data streams evolving over time is known as concept drift. The presence of multiple classes aggravates the loss in performance that occurs during drift detection in data streams. Several drift detectors and ensemble approaches have been widely employed; however, they either incur a high cost in memory consumption and run time or, in the case of ensemble approaches, may respond slowly because they train classifiers on outdated blocks. Motivated by this, we propose a hybrid block-based ensemble, a framework for multi-class classification in evolving data streams. The framework aims to integrate the main advantages of an online drift detector for a k-class problem with the concept of block-based weighting, so that it can react to different types of drift. Experimental evaluations on well-known synthetic and real-world datasets, comprising a comprehensive comparison against eleven drift detectors and five ensemble approaches, show that the proposed algorithms perform significantly better than the other drift detectors and ensemble approaches.
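The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of pairing an online drift detector with block-based weighting; it is not the authors' method. It assumes a DDM-style detector (after Gama et al., 2004) and a toy majority-class base learner, and every name in it (`DDM`, `MajorityClass`, `HybridBlockEnsemble`, `block_size`, `max_members`) is hypothetical.

```python
import math
from collections import Counter


class DDM:
    """DDM-style detector over a stream of 0/1 prediction errors (assumed choice)."""

    def __init__(self, min_n=30, drift_level=3.0):
        self.min_n, self.drift_level = min_n, drift_level
        self.reset()

    def reset(self):
        self.n, self.p = 0, 0.0
        self.p_min, self.s_min = float("inf"), float("inf")

    def update(self, error):
        """Feed 1 for a misclassification, 0 otherwise; return True on drift."""
        self.n += 1
        self.p += (error - self.p) / self.n          # running error rate
        s = math.sqrt(max(0.0, self.p * (1 - self.p)) / self.n)
        if self.n < self.min_n:
            return False
        if self.p + s < self.p_min + self.s_min:     # remember the best level seen
            self.p_min, self.s_min = self.p, s
        return self.p + s > self.p_min + self.drift_level * self.s_min


class MajorityClass:
    """Toy multi-class base learner: always predicts the block's most common label."""

    def __init__(self, block):
        self.label = Counter(y for _, y in block).most_common(1)[0][0]

    def predict(self, x):
        return self.label


class HybridBlockEnsemble:
    """Hypothetical hybrid: block-trained members plus an online drift detector."""

    def __init__(self, block_size=50, max_members=5):
        self.block_size, self.max_members = block_size, max_members
        self.members = []          # list of [classifier, weight]
        self.buffer = []           # labelled instances since the last rebuild
        self.detector = DDM()
        self.drifts = 0

    def predict(self, x):
        votes = Counter()
        for clf, weight in self.members:
            votes[clf.predict(x)] += weight
        return votes.most_common(1)[0][0] if votes else None

    def learn(self, x, y):
        drift = False
        if self.members:           # monitor errors only once a member exists
            drift = self.detector.update(int(self.predict(x) != y))
        self.buffer.append((x, y))
        if drift:
            self.drifts += 1
        if drift or len(self.buffer) >= self.block_size:
            self._rebuild(drop_old=drift)

    def _rebuild(self, drop_old):
        block = self.buffer
        if drop_old:               # on drift, discard members of the old concept
            self.members = []
            self.detector.reset()
        for member in self.members:  # reweight by accuracy on the newest block
            hits = sum(member[0].predict(x) == y for x, y in block)
            member[1] = hits / len(block)
        self.members.append([MajorityClass(block), 1.0])
        self.members.sort(key=lambda m: m[1], reverse=True)
        self.members = self.members[: self.max_members]
        self.buffer = []


# Synthetic stream with one sudden drift: class 0 for 200 steps, then class 2.
model = HybridBlockEnsemble()
for label in [0] * 200 + [2] * 200:
    model.learn(None, label)
```

In this sketch the detector lets the ensemble react between block boundaries, while the block-based reweighting handles changes too slow to trigger an alarm. With a zero-error baseline the DDM-style test fires on the first mistake, so a noisy stream would call for a more robust detector or threshold.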
Cite this article
Mahdi, O.A., Pardede, E. & Ali, N. A hybrid block-based ensemble framework for the multi-class problem to react to different types of drifts. Cluster Comput 24, 2327–2340 (2021). https://doi.org/10.1007/s10586-021-03267-7
DOI: https://doi.org/10.1007/s10586-021-03267-7