Abstract
Online classification learners operating under concept drift can be subject to latency in the arrival of examples at the training base. The impact of such latency on the definition of a time stamp is discussed against the background of the online learning life cycle. Data stream latency is modeled in a simulation environment that integrates the example life cycle. Two new algorithms are presented, CDTC versions 1 and 2, which use a specific time stamp protocol in which the time stamp represents the time of classification. These algorithms are compared against the earlier time stamp learning algorithms CD3 and CD5. A time stamp definition and an algorithmic solution are presented for handling latency in data streams and improving classification recovery in affected domains.
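The distinction between the time an example is classified and the time it reaches the training base can be illustrated with a minimal sketch. This is not the CDTC algorithm or the authors' simulation environment; the function name `training_base_order`, the record fields, and the integer latencies are all hypothetical, chosen only to show how latency can reorder arrivals while a time-of-classification stamp preserves the original order.

```python
import heapq

def training_base_order(stream, latencies):
    """Return examples in the order they reach the training base.

    stream: list of (features, label) pairs, one classified per time step.
    latencies: non-negative integer delays, one per example (assumed values).
    Each returned record keeps a time stamp equal to the time of classification.
    """
    pending = []
    for t, ((x, y), delay) in enumerate(zip(stream, latencies)):
        # Key on arrival time; t breaks ties and is unique, so x and y are never compared.
        heapq.heappush(pending, (t + delay, t, x, y))

    arrived = []
    while pending:
        arrival, classified_at, x, y = heapq.heappop(pending)
        arrived.append(
            {"arrival": arrival, "time_stamp": classified_at, "x": x, "label": y}
        )
    return arrived


if __name__ == "__main__":
    stream = [("a", 0), ("b", 1), ("c", 0)]
    latencies = [3, 0, 1]  # hypothetical delays, in time steps
    for record in training_base_order(stream, latencies):
        print(record)
```

In this toy run the example classified at time 1 arrives before the one classified at time 0, so a learner that stamped examples on arrival would see them in a different order from the one in which they were classified.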
Cite this article
Marrs, G.R., Black, M.M. & Hickey, R.J. The use of time stamps in handling latency and concept drift in online learning. Evolving Systems 3, 203–220 (2012). https://doi.org/10.1007/s12530-012-9055-4
DOI: https://doi.org/10.1007/s12530-012-9055-4