Abstract
Most work in Machine Learning assumes that examples are generated at random according to some stationary probability distribution. In this work we study the problem of learning when the distribution that generates the examples changes over time. We present a method for detecting changes in the probability distribution of examples. The idea behind the drift detection method is to monitor the online error rate of a learning algorithm, looking for significant deviations. The method can be used as a wrapper over any learning algorithm. In most problems, a change affects only some regions of the instance space, not the instance space as a whole. In decision models that fit different functions to different regions of the instance space, such as decision trees and rule learners, the method can monitor the error in each region separately, with the advantage of faster model adaptation. In this work we present experiments using the method as a wrapper over a decision tree and a linear model, and in each internal node of a decision tree. The experimental results, obtained in controlled experiments with artificial data and on a real-world problem, show good performance both in detecting drift and in adapting the decision model to the new concept.
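The error-rate monitor the abstract describes can be sketched as follows. This is an illustrative sketch only: the class name, the binomial standard-deviation estimate, and the two- and three-standard-deviation thresholds for the warning and drift levels are assumptions for exposition, not necessarily the paper's exact parameters.

```python
import math

class DriftDetector:
    """Monitor the online error rate of a wrapped learner and flag
    significant deviations from the best (lowest) rate seen so far."""

    def __init__(self):
        self.n = 0                  # examples seen
        self.errors = 0             # misclassifications seen
        self.p_min = float("inf")   # lowest error rate observed
        self.s_min = float("inf")   # its standard deviation

    def update(self, misclassified: bool) -> str:
        self.n += 1
        self.errors += int(misclassified)
        p = self.errors / self.n                  # current error rate
        s = math.sqrt(p * (1 - p) / self.n)       # binomial std. dev.
        # Track the best point reached while the concept is stable.
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s
        # Significant deviation: 3 sigma -> drift, 2 sigma -> warning
        # (assumed thresholds for this sketch).
        if p + s > self.p_min + 3 * self.s_min:
            return "drift"      # rebuild the decision model
        if p + s > self.p_min + 2 * self.s_min:
            return "warning"    # start buffering recent examples
        return "in-control"
```

Used as a wrapper, one such detector monitors the whole model; used locally, one detector per internal node of a decision tree monitors the error in that node's region of the instance space, so only the affected subtree needs to be rebuilt.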
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Gama, J., Castillo, G. (2006). Learning with Local Drift Detection. In: Li, X., Zaïane, O.R., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2006. Lecture Notes in Computer Science, vol 4093. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11811305_4
DOI: https://doi.org/10.1007/11811305_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37025-3
Online ISBN: 978-3-540-37026-0