Abstract
A life-long classifier that learns continually faces two challenges: concept drift, where the probability distribution of the data changes over time, and catastrophic forgetting, where earlier learned knowledge is lost. Many solutions have been proposed for each challenge separately, but very little research addresses both simultaneously. We show that both concept drift and catastrophic forgetting are closely related to our proposed description of life-long continual classification. We describe the process of continual learning as wrap modification, where a wrap is a manifold that can be trained to cover or uncover a given set of samples. The notion of wraps and their cover/uncover modifiers forms the theoretical building blocks of a novel general life-long learning scheme, implemented as an ensemble of variational autoencoders. The proposed algorithm is examined on evaluation scenarios for continual learning and compared to state-of-the-art algorithms; the results demonstrate robustness to catastrophic forgetting and adaptability to concept drift, but also reveal new challenges of life-long classification.
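To make the wrap notion above concrete, the following is a minimal sketch of how a cover/uncover manifold could be realized with a variational autoencoder, written in PyTorch. It is an illustration of the idea only, not the paper's ensgendel implementation (linked in the Notes below); the class names, the reconstruction-error threshold, and the hinge-style uncover loss are all assumptions made for this sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    """A small fully connected variational autoencoder."""
    def __init__(self, dim, latent=8):
        super().__init__()
        self.enc = nn.Linear(dim, 64)
        self.mu = nn.Linear(64, latent)
        self.logvar = nn.Linear(64, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 64), nn.ReLU(), nn.Linear(64, dim))

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        return self.dec(z), mu, logvar

class Wrap:
    """A trainable manifold that can cover or uncover a given set of samples."""
    def __init__(self, dim, threshold=0.05):
        self.vae = VAE(dim)
        self.threshold = threshold  # reconstruction error below this counts as "covered"
        self.opt = torch.optim.Adam(self.vae.parameters(), lr=1e-3)

    def _errors(self, x):
        recon, mu, logvar = self.vae(x)
        return ((recon - x) ** 2).mean(dim=1), mu, logvar  # per-sample MSE

    def cover(self, x, steps=200):
        """Pull the manifold over x by minimizing the VAE loss on it."""
        for _ in range(steps):
            err, mu, logvar = self._errors(x)
            kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(dim=1)
            loss = err.mean() + 1e-3 * kl.mean()
            self.opt.zero_grad()
            loss.backward()
            self.opt.step()

    def uncover(self, x, steps=200, margin=2.0):
        """Push the manifold off x by driving its reconstruction error above the threshold."""
        for _ in range(steps):
            err, _, _ = self._errors(x)
            loss = F.relu(margin * self.threshold - err).mean()  # hinge: stop once err is high enough
            self.opt.zero_grad()
            loss.backward()
            self.opt.step()

def classify(wraps, x):
    """Assign each sample to the lowest-error covering wrap; -1 means no wrap covers it."""
    with torch.no_grad():
        errs = torch.stack([w._errors(x)[0] for w in wraps])  # (num_wraps, batch)
    best = errs.argmin(dim=0)
    covered = errs.min(dim=0).values < wraps[0].threshold
    return torch.where(covered, best, torch.full_like(best, -1))
```

Under this reading, an ensemble keeps one wrap per class: concept drift can be followed by covering the drifted samples with the correct wrap and uncovering them from the outdated one, while each wrap retains its previously covered region, which is what resists catastrophic forgetting.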
Notes
Implementation of the evaluation framework and the ensgendel algorithm is available at https://github.com/comrob/ensgendel.
https://github.com/fikavw/CloGAN
Counter-intuitively, the famous Peano space-filling curve shows that there exists a continuous function mapping the unit interval onto the two-dimensional unit square; the construction generalizes to n-dimensional cubes.
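For completeness, the standard existence result behind this footnote can be stated formally; the symbol f_n below denotes the n-dimensional generalization the footnote mentions:

```latex
% Peano (1890): a continuous surjection from the interval onto the square exists,
% and the construction extends to any finite dimension n >= 1.
\[
  \exists\, f \colon [0,1] \to [0,1]^2 \ \text{continuous and surjective;}
  \qquad
  \forall n \ge 1 \;\; \exists\, f_n \colon [0,1] \to [0,1]^n \ \text{continuous and surjective.}
\]
```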
Acknowledgements
This work is an extension of the paper presented at the 13th International Workshop on Self-Organizing Maps and Learning Vector Quantization, Clustering and Data Visualization (WSOM 2019), where it received the Best Student Paper award.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This work was supported by the Czech Science Foundation (GAČR) under research project No. 18-18858S.
Cite this article
Szadkowski, R., Drchal, J. & Faigl, J. Continually trained life-long classification. Neural Comput & Applic 34, 135–152 (2022). https://doi.org/10.1007/s00521-021-06154-9