Abstract
Concept drift detection is an essential step in maintaining the accuracy of online machine learning. The main task is to detect changes in the data distribution that might shift the decision boundaries of a classification algorithm. Upon detecting drift, the classification algorithm may reset its model or concurrently grow a new learning model. Over the past fifteen years, several drift detection methods have been proposed, most of which have been implemented within the Massive Online Analysis (MOA) framework. A couple of studies have compared these drift detectors. However, such studies have focused merely on detection accuracy, and most of them use synthetic data sets only. Additionally, they do not consider drift detectors that are not integrated into MOA. Furthermore, none of the studies have considered other metrics such as resource consumption and runtime characteristics, which are of utmost importance from an operational point of view.
In this paper, we fill this gap by evaluating the performance of sixteen drift detection methods using three metrics: accuracy, runtime, and memory usage. To guarantee a fair comparison, all detectors are run within MOA. Fourteen of the algorithms are already implemented in MOA; we integrate the two remaining algorithms (ADWIN++ and SDDM) into MOA.
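To make the three metrics concrete, the following is a minimal sketch of a single-detector benchmark loop. It is not the paper's MOA-based setup: `SimpleDDM` is a simplified, hypothetical detector in the style of the classic error-rate-based drift detection method, `synthetic_errors` is an assumed deterministic stream of 0/1 misclassification indicators with one abrupt drift, and `benchmark` is an illustrative harness that reports detection positions (a rough proxy for accuracy), wall-clock runtime, and peak memory as traced by Python's `tracemalloc`.

```python
import math
import time
import tracemalloc


class SimpleDDM:
    """Simplified DDM-style detector over a stream of 0/1 prediction errors.

    Illustrative only; thresholds and names are assumptions of this sketch.
    """

    def __init__(self, min_instances=30, drift_level=3.0):
        self.min_instances = min_instances
        self.drift_level = drift_level
        self.reset()

    def reset(self):
        self.n = 0              # instances seen since the last reset
        self.p = 0.0            # running error rate
        self.p_min = math.inf   # error rate at the best point seen so far
        self.s_min = math.inf   # standard deviation at that point

    def update(self, error):
        """Feed one error indicator (1 = misclassified); return True on drift."""
        self.n += 1
        self.p += (error - self.p) / self.n
        s = math.sqrt(max(self.p * (1.0 - self.p), 0.0) / self.n)
        if self.n < self.min_instances:
            return False
        # Remember the point where error rate plus deviation was lowest.
        if self.p + s <= self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, s
        # Signal drift when the error rate rises well above that minimum.
        if self.p + s > self.p_min + self.drift_level * self.s_min:
            self.reset()
            return True
        return False


def synthetic_errors(length=2000, drift_at=1000):
    """Deterministic error stream: 10% errors before drift_at, 50% after."""
    for i in range(length):
        period = 10 if i < drift_at else 2
        yield 1 if i % period == 0 else 0


def benchmark(detector, errors):
    """Run the detector over the stream; report detections, runtime, memory."""
    tracemalloc.start()
    start = time.perf_counter()
    detections = [i for i, e in enumerate(errors) if detector.update(e)]
    runtime = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {"detections": detections, "runtime_s": runtime, "peak_bytes": peak}


result = benchmark(SimpleDDM(), synthetic_errors())
print(result)
```

In this sketch the detector resets itself after signalling drift, mirroring the model reset described above; a fuller harness would also retrain or replace the classifier at that point and would run every detector over the same streams.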
Acknowledgments
The work of Ahmed Awad is funded by the European Regional Development Funds (Mobilitas Plus Programme grant MOBTT75).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Mahgoub, M., Moharram, H., Elkafrawy, P., Awad, A. (2023). Benchmarking Concept Drift Detectors for Online Machine Learning. In: Fournier-Viger, P., Hassan, A., Bellatreche, L. (eds) Model and Data Engineering. MEDI 2022. Lecture Notes in Computer Science, vol 13761. Springer, Cham. https://doi.org/10.1007/978-3-031-21595-7_4
DOI: https://doi.org/10.1007/978-3-031-21595-7_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21594-0
Online ISBN: 978-3-031-21595-7
eBook Packages: Computer Science, Computer Science (R0)