Abstract
Most work in Machine Learning assumes that examples are generated at random according to some stationary probability distribution. In this work we study the problem of learning when the distribution that generates the examples changes over time. We present a method for detecting changes in the probability distribution of examples. The idea behind the drift detection method is to monitor the online error rate of a learning algorithm, looking for significant deviations. The method can be used as a wrapper over any learning algorithm. In most problems, a change affects only some regions of the instance space, not the instance space as a whole. In decision models that fit different functions to different regions of the instance space, such as decision trees and rule learners, the method can monitor the error in each region separately, with the advantage of faster model adaptation. In this work we present experiments using the method as a wrapper over a decision tree and a linear model, and in each internal node of a decision tree. The experimental results, obtained in controlled experiments with artificial data and on a real-world problem, show good performance both in detecting drift and in adapting the decision model to the new concept.
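The error-rate monitor the abstract describes can be sketched as follows. This is an illustrative sketch only: the class name, the binomial standard-deviation estimate, and the two- and three-standard-deviation thresholds for the warning and drift levels are assumptions for exposition, not necessarily the paper's exact parameters.

```python
import math

class DriftDetector:
    """Monitor the online error rate of a wrapped learner and flag
    significant deviations from the best (lowest) rate seen so far."""

    def __init__(self):
        self.n = 0                  # examples seen
        self.errors = 0             # misclassifications seen
        self.p_min = float("inf")   # lowest error rate observed
        self.s_min = float("inf")   # its standard deviation

    def update(self, misclassified: bool) -> str:
        self.n += 1
        self.errors += int(misclassified)
        p = self.errors / self.n                  # current error rate
        s = math.sqrt(p * (1 - p) / self.n)       # binomial std. dev.
        # Track the best point reached while the concept is stable.
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s
        # Significant deviation: 3 sigma -> drift, 2 sigma -> warning
        # (assumed thresholds for this sketch).
        if p + s > self.p_min + 3 * self.s_min:
            return "drift"      # rebuild the decision model
        if p + s > self.p_min + 2 * self.s_min:
            return "warning"    # start buffering recent examples
        return "in-control"
```

Used as a wrapper, one such detector monitors the whole model; used locally, one detector per internal node of a decision tree monitors the error in that node's region of the instance space, so only the affected subtree needs to be rebuilt.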
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Gama, J., Castillo, G. (2006). Learning with Local Drift Detection. In: Li, X., Zaïane, O.R., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2006. Lecture Notes in Computer Science, vol 4093. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11811305_4
DOI: https://doi.org/10.1007/11811305_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37025-3
Online ISBN: 978-3-540-37026-0