An Elastic Gradient Boosting Decision Tree for Concept Drift Learning

  • Conference paper
AI 2020: Advances in Artificial Intelligence (AI 2020)

Abstract

In a non-stationary data stream, concept drift occurs when different chunks of incoming data have different distributions. Hence, over time, the global optimization point of a learning model may permanently drift to the point where the model no longer adequately performs the task it was designed for. This phenomenon must be addressed to maintain the integrity and effectiveness of a model over the long term. In this paper, we propose a simple but effective drift learning algorithm called elastic Gradient Boosting Decision Tree (eGBDT). Since the prediction of a GBDT model is the sum of the outputs of a list of trees, we can easily append new trees to perform incremental learning or delete the last few trees to roll back to a previously known optimization point. The proposed eGBDT incrementally fits new data and detects drift by searching for the tree with the lowest residual. If the rollback would require deleting more trees than the initial ensemble contains, a retraining process is triggered. Comparisons of eGBDT with five state-of-the-art methods on eight data sets demonstrate its efficacy.
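
The elastic mechanism described above maps directly onto code. The following is a minimal illustrative sketch for a regression stream, built on scikit-learn's DecisionTreeRegressor; it is not the authors' implementation (their official code is linked in the Notes below), and the class name ElasticGBDT, the update method, and parameters such as n_init_trees and n_new_trees are assumptions made for this example.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

class ElasticGBDT:
    """Minimal elastic GBDT sketch: the ensemble is an ordered list of
    trees, so incremental learning appends trees and rollback truncates."""

    def __init__(self, n_init_trees=20, learning_rate=0.1, max_depth=3):
        self.n_init_trees = n_init_trees  # size of the initial ensemble
        self.lr = learning_rate
        self.max_depth = max_depth
        self.trees = []

    def predict(self, X):
        # The GBDT prediction is the sum of the outputs of all trees.
        pred = np.zeros(len(X))
        for tree in self.trees:
            pred += self.lr * tree.predict(X)
        return pred

    def _boost(self, X, y, n_trees):
        # Append n_trees new trees, each fit to the current residual.
        pred = self.predict(X)
        for _ in range(n_trees):
            tree = DecisionTreeRegressor(max_depth=self.max_depth)
            tree.fit(X, y - pred)
            pred += self.lr * tree.predict(X)
            self.trees.append(tree)

    def fit(self, X, y):
        # Train the initial ensemble from scratch.
        self.trees = []
        self._boost(X, y, self.n_init_trees)

    def update(self, X, y, n_new_trees=5):
        # Incremental step on a new chunk; assumes fit() was called.
        # Score every prefix of the tree list on the new chunk and keep
        # the prefix with the lowest residual (mean squared error here).
        pred = np.zeros(len(X))
        prefix_errors = []
        for tree in self.trees:
            pred += self.lr * tree.predict(X)
            prefix_errors.append(np.mean((y - pred) ** 2))
        best = int(np.argmin(prefix_errors)) + 1  # best prefix length

        if len(self.trees) - best > self.n_init_trees:
            # Rolling back would delete more trees than the initial
            # ensemble contains, so retrain from scratch instead.
            self.fit(X, y)
        else:
            self.trees = self.trees[:best]   # roll back
            self._boost(X, y, n_new_trees)   # then fit the new chunk

A streaming loop would call fit on the first data chunk and update on each subsequent chunk; because the model state is just an ordered list of trees, both incremental learning and rollback reduce to simple list operations.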

Notes

  1. https://github.com/kunkun111/AJCAI-eGBDT.

Acknowledgments

This work was supported by the Australian Research Council through the Discovery Project under Grant DP190101733.

Author information

Corresponding author

Correspondence to Jie Lu.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, K., Liu, A., Lu, J., Zhang, G., Xiong, L. (2020). An Elastic Gradient Boosting Decision Tree for Concept Drift Learning. In: Gallagher, M., Moustafa, N., Lakshika, E. (eds) AI 2020: Advances in Artificial Intelligence. AI 2020. Lecture Notes in Computer Science, vol 12576. Springer, Cham. https://doi.org/10.1007/978-3-030-64984-5_33

  • DOI: https://doi.org/10.1007/978-3-030-64984-5_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-64983-8

  • Online ISBN: 978-3-030-64984-5

  • eBook Packages: Computer Science, Computer Science (R0)
