Abstract
Real-world online learning applications often face data coming from changing target functions or distributions. Such changes, called the concept drift, degrade the performance of traditional online learning algorithms. Thus, many existing works focus on detecting concept drift based on statistical evidence. Other works use sliding window or similar mechanisms to select the data that closely reflect current concept. Nevertheless, few works study how the detection and selection techniques can be combined to improve the learning performance. We propose a novel framework on top of existing online learning algorithms to improve the learning performance under concept drifts. The framework detects the possible concept drift by checking whether forgetting some older data may be helpful, and then conduct forgetting through a step called unlearning. The framework effectively results in a dynamic sliding window that selects some data flexibly for different kinds of concept drifts. We design concrete approaches from the framework based on three popular online learning algorithms. Empirical results show that the framework consistently improves those algorithms on ten synthetic data sets and two real-world data sets.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsNotes
- 1.
Handwritten digits: http://yann.lecun.com/exdb/mnist.
- 2.
Electricity price data: http://www.inescporto.pt/~jgama/ales/ales.html.
References
Baena-García, M., del Campo-Ávila, J., Fidalgo, R., Bifet, A., Gavaldà, R., Morales-Bueno, R.: Early drift detection method. In: International Workshop on Knowledge Discovery from Data Streams, pp. 77–86 (2006)
Bifet, A., Gavalda, R.: Learning from time-changing data with adaptive windowing. In: SIAM International Conference on Data Mining, pp. 443–448 (2007)
Crammer, K., Dekel, O., Keshet, J., Shalev-Shwartz, S., Singer, Y.: Online passive-aggressive algorithms. J. Mach. Learn. Res. 7, 551–585 (2006)
Dredze, M., Crammer, K., Pereira, F.: Confidence-weighted linear classification. In: Proceedings of the 25th International Conference on Machine Learning, pp. 264–271 (2008)
Gama, J., Medas, P., Castillo, G., Rodrigues, P.: Learning with drift detection. In: Bazzan, A.L.C., Labidi, S. (eds.) SBIA 2004. LNCS (LNAI), vol. 3171, pp. 286–295. Springer, Heidelberg (2004)
Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., Bouchachia, A.: A survey on concept drift adaptation. ACM Comput. Surv. (CSUR) 46(4), 1–37 (2014)
Harries, M., Wales, N.S.: Splice-2 comparative evaluation: electricity pricing (1999)
Hulten, G., Spencer, L., Domingos, P.: Mining time-changing data streams. In: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 97–106 (2001)
Crammer, K., Alex Kulesza, M.D.: Adaptive regularization of weight vectors. Mach. Learn. 91, 155–187 (2013)
Kolter, J.Z., Maloof, M.A.: Dynamic weighted majority: an ensemble method for drifting concepts. J. Mach. Learn. Res. 8, 2755–2790 (2007)
LeCun, Y., Cortes, C.: MNIST handwritten digit database. AT&T Labs (2010)
Nishida, K., Yamauchi, K.: Detecting concept drift using statistical testing. In: Corruble, V., Takeda, M., Suzuki, E. (eds.) DS 2007. LNCS (LNAI), vol. 4755, pp. 264–269. Springer, Heidelberg (2007)
Ralf, K., Joachims, T.: Detecting concept drift with support vector machines. In: Proceedings of the Seventeenth International Conference on Machine Learning, pp. 487–494 (2000)
Shalev-Shwartz, S.: Online learning and online convex optimization. Found. Trends Mach. Learn. 4, 107–194 (2012)
Sobhani, P., Beigy, H.: New drift detection method for data streams. Adapt. Intell. Syst. 6943, 88–97 (2011)
Tsymbal, A.: The problem of concept drift: definitions and related work. Technical report, Computer Science Department, Trinity College Dublin (2004)
Widmer, G., Kubat, M.: Learning in the presence of concept drift and hidden contexts. Mach. Learn. 23, 69–101 (1996)
You, S.C.: Dynamic unlearning for online learning on concept-drifting data. Masters thesis, National Taiwan University (2015)
Acknowledgment
The work arises from the Master’s thesis of the first author [18]. We thank Profs. Yuh-Jye Lee, Shou-De Lin, the anonymous reviewers and the members of the NTU Computational Learning Lab for valuable suggestions. This work is partially supported by the Ministry of Science and Technology of Taiwan (MOST 103-2221-E-002-148-MY3) and the Asian Office of Aerospace Research and Development (AOARD FA2386-15-1-4012).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
You, SC., Lin, HT. (2016). A Simple Unlearning Framework for Online Learning Under Concept Drifts. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science(), vol 9651. Springer, Cham. https://doi.org/10.1007/978-3-319-31753-3_10
Download citation
DOI: https://doi.org/10.1007/978-3-319-31753-3_10
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31752-6
Online ISBN: 978-3-319-31753-3
eBook Packages: Computer ScienceComputer Science (R0)