An Algorithm for Anticipating Future Decision Trees from Concept-Drifting Data

  • Conference paper
Research and Development in Intelligent Systems XXV (SGAI 2008)

Abstract

Concept drift is an important topic in practical data mining, since it is a reality in most business applications. Whenever a mining model is used in an application, it is already outdated, because the world has changed since the model was induced. The solution is to predict the drift of a model and derive a future model based on that prediction. One way would be to simulate future data and derive a model from it, but this is typically not feasible. Instead, we suggest predicting the values of the measures that drive model induction. In particular, we propose to predict the future values of attribute selection measures and of the class label distribution for the induction of decision trees. We give an example of how concept drift is reflected in the trend of these measures and show that the resulting decision trees perform considerably better than those produced by existing approaches.
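
The idea of predicting an attribute selection measure rather than simulating future data can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say which measure or which predictor is used, so information gain and a least-squares linear trend are assumptions chosen for illustration, and all function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attr, target):
    """Information gain of splitting `rows` (a list of dicts) on `attr`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def extrapolate_linear(values):
    """Fit a least-squares line through (0, v0), (1, v1), ... and
    return its value at the next time step t = len(values).

    Assumption: a linear trend stands in for whatever predictor is
    actually used; with fewer than two points the last value is returned.
    """
    n = len(values)
    if n < 2:
        return values[-1]
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values)) / denom
    return v_mean + slope * (n - t_mean)


def predict_best_attribute(gain_history):
    """Choose the split attribute by its *predicted* gain for the next
    time window instead of its most recent observed gain.

    `gain_history` maps attribute name -> list of gains, one per window.
    """
    predicted = {a: extrapolate_linear(g) for a, g in gain_history.items()}
    return max(predicted, key=predicted.get), predicted


# Gain of each attribute on one window of toy data (class equals `a`).
window = [
    {"a": 0, "b": 0, "y": 0},
    {"a": 0, "b": 1, "y": 0},
    {"a": 1, "b": 0, "y": 1},
    {"a": 1, "b": 1, "y": 1},
]
gain_a = information_gain(window, "a", "y")
gain_b = information_gain(window, "b", "y")

# Hypothetical gain histories over three past windows: `a` is still the
# best attribute in the latest window, but its gain is falling while
# the gain of `b` is rising.
history = {"a": [0.9, 0.7, 0.5], "b": [0.1, 0.3, 0.45]}
best_now = max(history, key=lambda a: history[a][-1])
best_next, predicted = predict_best_attribute(history)
```

In this toy history, a tree induced from the latest window would split on `a`, whereas the trend suggests `b` for the upcoming window. An extrapolated gain can leave the valid range of the measure, so a practical version would need to constrain the prediction; the class label distribution mentioned in the abstract could be extrapolated in the same way.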


Copyright information

© 2009 Springer-Verlag London Limited

About this paper

Cite this paper

Böttcher, M., Spott, M., Kruse, R. (2009). An Algorithm for Anticipating Future Decision Trees from Concept-Drifting Data. In: Bramer, M., Petridis, M., Coenen, F. (eds) Research and Development in Intelligent Systems XXV. SGAI 2008. Springer, London. https://doi.org/10.1007/978-1-84882-171-2_21

  • DOI: https://doi.org/10.1007/978-1-84882-171-2_21

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84882-170-5

  • Online ISBN: 978-1-84882-171-2

  • eBook Packages: Computer Science, Computer Science (R0)
