
On the Change of Decision Boundary and Loss in Learning with Concept Drift

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13876)

Abstract

Concept drift, i.e., a change of the data-generating distribution, can render machine learning models inaccurate. Many technologies for learning under drift rely on the interleaved test-train error (ITTE) to evaluate model performance and to trigger drift detection and model updates. Online learning theory mainly focuses on generalization bounds for the future loss; usually, these bounds are too loose to be of practical use, and improving them further is hardly possible since they are tight in many cases. In this work, we present a new theoretical framework that focuses on more practical questions: how the training result, the optimal model, and the ITTE change in the presence (and with the type) of drift. We support our theoretical findings with empirical evidence for several learning algorithms, models, and datasets.
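The ITTE mentioned in the abstract is the prequential ("test-then-train") error: each incoming sample is first used to test the current model and only then to train it, so the running error reflects performance on data the model has not yet seen. A minimal sketch, assuming a toy online learner (the `MajorityClass` predictor below is a hypothetical illustration, not a method from the paper):

```python
def interleaved_test_train_error(stream, model):
    """Prequential (ITTE) evaluation: test on each sample, then train on it."""
    errors = 0
    n = 0
    for x, y in stream:
        n += 1
        if model.predict(x) != y:   # test first
            errors += 1
        model.update(x, y)          # then train
    return errors / n


class MajorityClass:
    """Toy online learner: always predicts the majority label seen so far."""
    def __init__(self):
        self.counts = {}

    def predict(self, x):
        return max(self.counts, key=self.counts.get) if self.counts else None

    def update(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1


# Stream with an abrupt concept drift: the label flips halfway through,
# so the ITTE jumps once the drift occurs.
stream = [(i, 0) for i in range(50)] + [(i, 1) for i in range(50)]
itte = interleaved_test_train_error(iter(stream), MajorityClass())
```

On this stream the learner misclassifies the very first sample and every post-drift sample, illustrating why a rising ITTE is commonly used as a drift-detection signal.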

We gratefully acknowledge funding by the BMBF TiM, grant number 05M20PBA.



Author information

Correspondence to Fabian Hinder.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Hinder, F., Vaquet, V., Brinkrolf, J., Hammer, B. (2023). On the Change of Decision Boundary and Loss in Learning with Concept Drift. In: Crémilleux, B., Hess, S., Nijssen, S. (eds) Advances in Intelligent Data Analysis XXI. IDA 2023. Lecture Notes in Computer Science, vol 13876. Springer, Cham. https://doi.org/10.1007/978-3-031-30047-9_15

  • DOI: https://doi.org/10.1007/978-3-031-30047-9_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-30046-2

  • Online ISBN: 978-3-031-30047-9

  • eBook Packages: Computer Science; Computer Science (R0)
