Abstract
Extracting meaningful patterns from time-changing data streams is of increasing importance to the machine learning and data mining communities. We present an algorithm that learns regression trees from fast, unbounded data streams in the presence of concept drift. To the best of our knowledge, no other algorithm for incrementally learning regression trees is equipped with change detection abilities. The FIRT-DD algorithm has mechanisms for drift detection and model adaptation, which make it possible to maintain accurate and up-to-date regression models at any time. The drift detection mechanism is based on sequential statistical tests that track the evolution of the local error at each node of the tree and inform the learning process of detected changes. In response to a local drift, the algorithm adapts the model only locally, avoiding the need for global model adaptation. The adaptation strategy consists of building a new tree whenever a change is suspected in a region and replacing the old tree when the new one becomes more accurate. This enables smooth and granular adaptation of the global model. An empirical evaluation over several different types of drift shows that the algorithm detects concept drift consistently and adapts to it properly.
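To make the idea of a sequential test on a node's local error concrete, here is a minimal Python sketch. The abstract does not say which test is used, so the sketch assumes the Page-Hinkley test, one common choice for this kind of monitoring. The class name, the `first_alarm` helper, and the values of `delta` and `threshold` are illustrative assumptions, not the authors' implementation or settings.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PageHinkley:
    """Page-Hinkley test for an increase in the mean of a monitored signal.

    Here the signal would be the absolute prediction error observed at one
    tree node. Parameter defaults are assumed values for illustration only.
    """
    delta: float = 0.005      # magnitude of change that is tolerated (assumed)
    threshold: float = 50.0   # alarm threshold, often written lambda (assumed)
    n: int = 0                # number of observations seen so far
    mean: float = 0.0         # running mean of the signal
    cum: float = 0.0          # cumulative deviation from the running mean
    cum_min: float = 0.0      # smallest cumulative deviation seen so far

    def update(self, x: float) -> bool:
        """Feed one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


def first_alarm(errors: Iterable[float]) -> Optional[int]:
    """Return the index of the first alarm in a stream of errors, or None."""
    detector = PageHinkley()
    for i, err in enumerate(errors):
        if detector.update(err):
            return i
    return None


if __name__ == "__main__":
    stable = [0.1] * 400                    # error level never changes
    drifting = [0.1] * 200 + [1.0] * 200    # error jumps at index 200
    print("stable stream alarm:", first_alarm(stable))
    print("drifting stream alarm:", first_alarm(drifting))
```

In a tree learner, one such detector could be kept per node and fed the error of every example that passes through it, so that an alarm identifies the region of the tree affected by the change. A larger `threshold` reduces false alarms at the cost of slower detection.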
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ikonomovska, E., Gama, J., Sebastião, R., Gjorgjevik, D. (2009). Regression Trees from Data Streams with Drift Detection. In: Gama, J., Costa, V.S., Jorge, A.M., Brazdil, P.B. (eds) Discovery Science. DS 2009. Lecture Notes in Computer Science, vol 5808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04747-3_12
DOI: https://doi.org/10.1007/978-3-642-04747-3_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04746-6
Online ISBN: 978-3-642-04747-3
eBook Packages: Computer Science, Computer Science (R0)