Abstract
We present a systematic study of catastrophic forgetting (CF), i.e., the abrupt loss of previously acquired knowledge, when retraining deep recurrent LSTM networks on new samples. CF has recently received renewed attention in the case of feed-forward DNNs, and this article is the first work that aims to rigorously establish whether, and to what degree, deep LSTM networks are afflicted by CF as well. To test this thoroughly, training is conducted on a wide variety of high-dimensional, image-based sequence classification tasks derived from established visual classification benchmarks (MNIST, Devanagari, FashionMNIST, and EMNIST). We find that the CF effect occurs without exception for deep LSTM-based sequence classifiers, regardless of how the sequences are constructed and where they come from. This leads us to conclude that LSTMs, just like DNNs, are fully affected by CF, and that further research is needed to determine how to avoid this effect (mitigation is not a goal of this study).