Unsupervised Concept Drift Detection Based on Stacked Autoencoder and Page-Hinckley Test

  • Conference paper
Green, Pervasive, and Cloud Computing (GPC 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14503)


Abstract

Data streams are often subject to concept drift, which can gradually reduce the reliability of learning models in data stream mining. To maintain model accuracy and robustness, it is crucial to detect concept drift and update the learning model accordingly. Most drift detection methods assume that true labels are immediately available, an assumption that rarely holds in real-world scenarios. It is therefore more practical to detect concept drift in an unsupervised manner. This paper proposes an unsupervised Drift Detection method based on Stacked Autoencoder and Page-Hinckley test (DD-SAPH). DD-SAPH uses a stacked autoencoder, which extracts hidden features from the reference window, to represent the distribution of historical data. The reconstruction error of the stacked autoencoder on the current window measures the difference between the distributions of historical and new data. The Page-Hinckley test dynamically calculates the thresholds used to raise drift warnings and alarms. Experimental results indicate that DD-SAPH outperforms the compared unsupervised algorithms when addressing concept drift on both synthetic and real datasets.
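
The detection step described in the abstract can be illustrated with a minimal sketch of the textbook Page-Hinckley test applied to a stream of reconstruction errors. This is not the authors' implementation: the class name `PageHinckley`, the helper `run`, the tolerance `delta`, and the fixed `warn_threshold` and `alarm_threshold` values are all assumptions made for illustration. In particular, the abstract says DD-SAPH calculates its thresholds dynamically, and that procedure is not reproduced here. The autoencoder is also omitted; the input is assumed to be one reconstruction error per window.

```python
from dataclasses import dataclass


@dataclass
class PageHinckley:
    """Textbook Page-Hinckley test for an upward shift in a stream's mean.

    All parameter values are hypothetical defaults, not taken from the paper.
    """
    delta: float = 0.005           # tolerated magnitude of change (assumed)
    warn_threshold: float = 1.0    # fixed warning level (assumed)
    alarm_threshold: float = 2.0   # fixed alarm level (assumed)
    n: int = 0                     # number of observations seen
    mean: float = 0.0              # running mean of the observations
    cum: float = 0.0               # cumulative deviation from the running mean
    cum_min: float = 0.0           # minimum of the cumulative deviation so far

    def update(self, error: float) -> str:
        """Consume one reconstruction error and return the detector state."""
        self.n += 1
        self.mean += (error - self.mean) / self.n
        self.cum += error - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        statistic = self.cum - self.cum_min
        if statistic > self.alarm_threshold:
            return "alarm"
        if statistic > self.warn_threshold:
            return "warning"
        return "stable"


def run(errors):
    """Feed a sequence of errors to a fresh detector and collect its states."""
    detector = PageHinckley()
    return [detector.update(e) for e in errors]


if __name__ == "__main__":
    # Synthetic errors: low while the data matches the reference window,
    # then a jump that stands in for a drifted current window.
    stream = [0.1] * 50 + [1.0] * 5
    for index, state in enumerate(run(stream)):
        if state != "stable":
            print(index, state)
```

In a fuller pipeline, each value passed to `update` would be the autoencoder's reconstruction error on the current window, and an alarm would presumably trigger retraining on a new reference window; those steps are outside what the abstract specifies.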

Acknowledgments

This research was supported by the National Key Research and Development Program of China under Grant No. 2022ZD0115403, the National Natural Science Foundation of China under Grant No. 62072236, and the Fundamental Research Funds for the Central Universities under Grant No. 56XCA2205404.

Corresponding author

Correspondence to Yunlong Zhao.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Zhan, S., Li, Y., Liu, C., Zhao, Y. (2024). Unsupervised Concept Drift Detection Based on Stacked Autoencoder and Page-Hinckley Test. In: Jin, H., Yu, Z., Yu, C., Zhou, X., Lu, Z., Song, X. (eds) Green, Pervasive, and Cloud Computing. GPC 2023. Lecture Notes in Computer Science, vol 14503. Springer, Singapore. https://doi.org/10.1007/978-981-99-9893-7_15

  • DOI: https://doi.org/10.1007/978-981-99-9893-7_15

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-9892-0

  • Online ISBN: 978-981-99-9893-7

  • eBook Packages: Computer Science, Computer Science (R0)
