Concept Drift Adaptation by Exploiting Drift Type

Published: 12 February 2024

Abstract

Concept drift is a phenomenon in which the distribution of a data stream changes over time. When this happens, model predictions become less accurate, so models built in the past need to be re-learned for the current data. A re-learning strategy must address two design questions: which type of concept drift has occurred, and how to use the drift type to improve re-learning performance. Existing drift detection methods are often good at determining when drift has occurred, but few retrieve information about how the drift came to be present in the stream. As a result, it is difficult to determine the impact of the drift type on adaptation. To fill this gap, we designed the Type-Driven Lazy Drift Adaptor (Type-LDA), a framework based on a lazy strategy. Type-LDA first retrieves information about both how and when a drift has occurred, then uses this information to re-learn the new model. To identify the type of drift, a drift type identifier is pre-trained on synthetic data of known drift types. Furthermore, a drift point locator locates the optimal point of drift via a sharing loss. Hence, Type-LDA can select the optimal point, according to the drift type, from which to re-learn the new model. Experiments validate Type-LDA on both synthetic and real-world data, and the results show that accurately identifying the drift type can improve adaptation accuracy.
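
The abstract does not say which drift types or which synthetic generator are used for pre-training. As an illustration only, the sketch below assumes a common three-way taxonomy (sudden, gradual, incremental) and a one-dimensional stream whose mean moves from 0 to 1. Every name in it (`make_drift_stream`, `make_training_set`, `DRIFT_TYPES`, and all parameters) is hypothetical and is not taken from the paper. It shows how labelled streams with a known drift type and drift point could be produced for a type identifier and a point locator to learn from.

```python
import random

# Assumed taxonomy; the abstract does not enumerate the drift types.
DRIFT_TYPES = ("sudden", "gradual", "incremental")


def make_drift_stream(drift_type, length=200, drift_point=100,
                      width=40, noise=0.05, seed=0):
    """Return a list of floats whose mean moves from 0.0 (old concept)
    to 1.0 (new concept), starting at index `drift_point`."""
    if drift_type not in DRIFT_TYPES:
        raise ValueError(f"unknown drift type: {drift_type!r}")
    rng = random.Random(seed)
    stream = []
    for t in range(length):
        # Fraction of the transition completed at time t, clamped to [0, 1].
        progress = min(1.0, max(0.0, (t - drift_point) / width))
        if drift_type == "sudden":
            # Hard switch: old concept before the drift point, new after.
            mean = 1.0 if t >= drift_point else 0.0
        elif drift_type == "incremental":
            # The concept itself slides smoothly from old to new.
            mean = progress
        else:  # "gradual"
            # Samples alternate between concepts; the new one is drawn
            # with a probability that rises over the transition window.
            mean = 1.0 if rng.random() < progress else 0.0
        stream.append(rng.gauss(mean, noise))
    return stream


def make_training_set(n_per_type=3, length=200, seed=0):
    """Build (stream, drift_type, drift_point) examples with known labels."""
    rng = random.Random(seed)
    examples = []
    for drift_type in DRIFT_TYPES:
        for _ in range(n_per_type):
            drift_point = rng.randrange(length // 4, length // 2)
            stream = make_drift_stream(drift_type, length=length,
                                       drift_point=drift_point,
                                       seed=rng.randrange(10**6))
            examples.append((stream, drift_type, drift_point))
    return examples
```

Each example carries both labels, so the same synthetic set could supervise a classifier over drift types and a regressor over drift points. How Type-LDA couples the two tasks through its sharing loss is not described in the abstract and is not modelled here.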

    Published In

    ACM Transactions on Knowledge Discovery from Data, Volume 18, Issue 4
    May 2024
    707 pages
    EISSN:1556-472X
    DOI:10.1145/3613622

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 12 February 2024
    Online AM: 02 January 2024
    Accepted: 15 December 2023
    Revised: 28 September 2023
    Received: 30 November 2022
    Published in TKDD Volume 18, Issue 4

    Author Tags

    1. Concept drift
    2. data streams
    3. drift detection
    4. drift adaptation

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China Youth Fund
    • Shanghai Committee of Science and Technology, China
    • Shanghai Yangfan Program
