Abstract
Data streams are often subject to concept drift, which gradually reduces the reliability of learning models in data stream mining. To maintain model accuracy and robustness, it is crucial to detect concept drift and update the learning model accordingly. Most drift detection methods assume that true labels are immediately available, an assumption that is difficult to satisfy in real-world scenarios. It is therefore more practical to detect concept drift in an unsupervised manner. This paper proposes an unsupervised Drift Detection method based on Stacked Autoencoder and Page-Hinckley test (DD-SAPH). DD-SAPH uses a stacked autoencoder to represent the distribution of historical data by extracting hidden features from the reference window. The reconstruction error of the stacked autoencoder on the current window then measures the difference between the distributions of historical and new data. The Page-Hinckley test dynamically calculates the thresholds used to raise warnings and alarms for concept drift. Experimental results indicate that DD-SAPH outperforms the compared unsupervised algorithms when addressing concept drift on both synthetic and real datasets.
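As a rough illustration of the detection step described above, the following minimal sketch applies a standard Page-Hinckley test to a sequence of reconstruction errors. It is not the authors' implementation: the class name `PageHinckley`, the parameter `delta`, and the fixed `warn_threshold` and `drift_threshold` values are assumptions made for this example (the abstract states that DD-SAPH calculates its thresholds dynamically but does not say how), and the synthetic `errors` list stands in for the per-window reconstruction errors that a trained stacked autoencoder would produce.

```python
class PageHinckley:
    """Page-Hinckley test for an upward shift in a monitored value.

    Hypothetical sketch: fixed thresholds are an assumption of this example,
    not the dynamic thresholds mentioned in the abstract.
    """

    def __init__(self, delta=0.005, warn_threshold=1.0, drift_threshold=3.0):
        self.delta = delta                      # tolerated magnitude of change
        self.warn_threshold = warn_threshold    # statistic level for a warning
        self.drift_threshold = drift_threshold  # statistic level for an alarm
        self.n = 0           # number of observations seen so far
        self.mean = 0.0      # running mean of the observations
        self.cum = 0.0       # cumulative deviation from the running mean
        self.min_cum = 0.0   # smallest cumulative deviation seen so far

    def update(self, x):
        """Consume one observation and return 'stable', 'warning' or 'drift'."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        stat = self.cum - self.min_cum
        if stat > self.drift_threshold:
            return "drift"
        if stat > self.warn_threshold:
            return "warning"
        return "stable"


# Synthetic stand-in for reconstruction errors: low while new windows match
# the reference distribution, higher once the distribution has changed.
errors = [0.1] * 50 + [1.0] * 50

detector = PageHinckley()
statuses = [detector.update(e) for e in errors]
first_warning = statuses.index("warning")
first_drift = statuses.index("drift")
print("first warning at window", first_warning)
print("first drift at window", first_drift)
```

In this sketch the statistic stays at zero while the error is constant, so no warning is raised before the change, and the warning level is crossed before the alarm level once the errors increase. A full detector would typically reset its state and retrain the autoencoder on a new reference window after an alarm; that step is omitted here.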
Acknowledgments
This research was supported by the National Key Research and Development Program of China under Grant No. 2022ZD0115403, the National Natural Science Foundation of China under Grant No. 62072236, and the Fundamental Research Funds for the Central Universities under Grant No. 56XCA2205404.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhan, S., Li, Y., Liu, C., Zhao, Y. (2024). Unsupervised Concept Drift Detection Based on Stacked Autoencoder and Page-Hinckley Test. In: Jin, H., Yu, Z., Yu, C., Zhou, X., Lu, Z., Song, X. (eds) Green, Pervasive, and Cloud Computing. GPC 2023. Lecture Notes in Computer Science, vol 14503. Springer, Singapore. https://doi.org/10.1007/978-981-99-9893-7_15
DOI: https://doi.org/10.1007/978-981-99-9893-7_15
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-9892-0
Online ISBN: 978-981-99-9893-7
eBook Packages: Computer Science, Computer Science (R0)