A Study on Catastrophic Forgetting in Deep LSTM Networks

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2019: Deep Learning (ICANN 2019)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 11728)

Abstract

We present a systematic study of Catastrophic Forgetting (CF), i.e., the abrupt loss of previously acquired knowledge, when retraining deep recurrent LSTM networks on new samples. CF has recently received renewed attention in the case of feed-forward DNNs, and this article is the first work to rigorously establish whether deep LSTM networks are afflicted by CF as well, and to what degree. To test this thoroughly, training is conducted on a wide variety of high-dimensional, image-based sequence classification tasks derived from established visual classification benchmarks (MNIST, Devanagari, FashionMNIST and EMNIST). We find that the CF effect occurs without exception for deep LSTM-based sequence classifiers, regardless of the construction and provenance of the sequences. This leads us to conclude that LSTMs, just like DNNs, are fully affected by CF, and that further research is needed to determine how to avoid this effect (which is beyond the scope of this study).
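
The two-task retraining protocol at the heart of this study can be made concrete with a short sketch. The following is a minimal, hypothetical reconstruction, not the authors' implementation: it assumes PyTorch (which the paper does not prescribe), reads each 28x28 MNIST image row by row as a 28-step sequence of 28-dimensional vectors, trains an LSTM classifier on a first task (digits 0-4), then retrains it on a second task (digits 5-9) without replaying any first-task data; the class split, network size and epoch counts are illustrative assumptions.

# Hypothetical CF sketch (assumed PyTorch): two-task LSTM retraining on row-wise MNIST sequences.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

class LSTMClassifier(nn.Module):
    def __init__(self, input_size=28, hidden_size=128, num_layers=2, num_classes=10):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, x):                 # x: (batch, 28, 28) = 28 time steps of 28 features
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])   # classify from the last time step

def task_loader(ds, classes, batch_size=128, shuffle=True):
    # Restrict the dataset to the classes belonging to one task.
    idx = [i for i, y in enumerate(ds.targets.tolist()) if y in classes]
    return DataLoader(Subset(ds, idx), batch_size=batch_size, shuffle=shuffle)

def train(model, loader, epochs=2, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            loss = loss_fn(model(x.squeeze(1)), y)   # (batch, 1, 28, 28) -> (batch, 28, 28)
            opt.zero_grad()
            loss.backward()
            opt.step()

@torch.no_grad()
def accuracy(model, loader):
    model.eval()
    correct = total = 0
    for x, y in loader:
        correct += (model(x.squeeze(1)).argmax(1) == y).sum().item()
        total += y.numel()
    return correct / total

if __name__ == "__main__":
    tf = transforms.ToTensor()
    train_ds = datasets.MNIST("data", train=True, download=True, transform=tf)
    test_ds = datasets.MNIST("data", train=False, download=True, transform=tf)
    task_a, task_b = [0, 1, 2, 3, 4], [5, 6, 7, 8, 9]
    model = LSTMClassifier()
    train(model, task_loader(train_ds, task_a))                  # learn task A
    test_a = task_loader(test_ds, task_a, shuffle=False)
    print("task A accuracy before retraining:", accuracy(model, test_a))
    train(model, task_loader(train_ds, task_b))                  # retrain on task B only
    print("task A accuracy after retraining: ", accuracy(model, test_a))

Under a protocol of this kind, first-task accuracy typically collapses toward chance after retraining on the second task; this drop is the CF effect the study measures across sequence variants of MNIST, Devanagari, FashionMNIST and EMNIST.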

References

  1. Acharya, S.: Deep Learning Based Large Scale Handwritten Devanagari Character Recognition (2015). https://doi.org/10.31979/etd.3yh5-xs5s

  2. Aljundi, R., Rohrbach, M., Tuytelaars, T.: Selfless Sequential Learning (2018)

  3. Cohen, G., Afshar, S., Tapson, J., van Schaik, A.: EMNIST: extending MNIST to handwritten letters. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN 2017), pp. 2921–2926 (2017). https://doi.org/10.1109/IJCNN.2017.7966217

  4. Coop, R., Arel, I.: Mitigation of catastrophic forgetting in recurrent neural networks using a fixed expansion layer. In: The 2013 International Joint Conference on Neural Networks (IJCNN), pp. 1–7, August 2013. https://doi.org/10.1109/IJCNN.2013.6707047

  5. Fernando, C., et al.: PathNet: Evolution Channels Gradient Descent in Super Neural Networks (2017)

  6. French, R.: Catastrophic forgetting in connectionist networks. Trends Cogn. Sci. 3(4), 128–135 (1999). https://doi.org/10.1016/S1364-6613(99)01294-2

  7. Sarkar, A., Gepperth, A., Handmann, U., Kopinski, T.: Dynamic hand gesture recognition for mobile systems using deep LSTM. In: Horain, P., Achard, C., Mallem, M. (eds.) IHCI 2017. LNCS, vol. 10688, pp. 19–31. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-72038-8_3

  8. Gepperth, A., Hammer, B.: Incremental learning algorithms and applications. In: European Symposium on Artificial Neural Networks (ESANN), pp. 357–368 (April 2016)

  9. Gepperth, A., Karaoguz, C.: A bio-inspired incremental learning architecture for applied perceptual problems. Cogn. Comput. 8(5), 924–934 (2016). https://doi.org/10.1007/s12559-016-9389-5

  10. Goodfellow, I.J., Mirza, M., Xiao, D., Courville, A., Bengio, Y.: An empirical investigation of catastrophic forgetting in gradient-based neural networks. arXiv preprint arXiv:1312.6211 (2013)

  11. Graves, A.: Supervised sequence labelling. In: Supervised Sequence Labelling with Recurrent Neural Networks, vol. 385, pp. 5–13. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-24797-2_2

  12. Graves, A., Jaitly, N.: Towards end-to-end speech recognition with recurrent neural networks. In: Xing, E.P., Jebara, T. (eds.) Proceedings of the 31st International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 32, pp. 1764–1772. PMLR, Beijing, China, 22–24 June 2014. http://proceedings.mlr.press/v32/graves14.html

  13. Jaeger, H.: Adaptive nonlinear system identification with echo state networks. In: Becker, S., Thrun, S., Obermayer, K. (eds.) Advances in Neural Information Processing Systems, vol. 15, pp. 609–616. MIT Press (2003). http://papers.nips.cc/paper/2318-adaptive-nonlinear-system-identification-with-echo-state-networks.pdf

  14. Jia, X., et al.: Incremental dual-memory LSTM in land cover prediction. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2017, pp. 867–876. ACM, New York (2017). https://doi.org/10.1145/3097983.3098112

  15. Kamra, N., Gupta, U., Liu, Y.: Deep generative dual memory network for continual learning. arXiv preprint arXiv:1710.10368 (2017). http://arxiv.org/abs/1710.10368

  16. Kemker, R., Kanan, C.: FearNet: Brain-Inspired Model for Incremental Learning, pp. 1–16 (2017)

  17. Kemker, R., McClure, M., Abitino, A., Hayes, T., Kanan, C.: Measuring catastrophic forgetting in neural networks. arXiv preprint arXiv:1708.02072 (2017)

  18. Kim, H.-E., Kim, S., Lee, J.: Keep and learn: continual learning by constraining the latent space for knowledge preservation in neural networks. In: Frangi, A.F., Schnabel, J.A., Davatzikos, C., Alberola-López, C., Fichtinger, G. (eds.) MICCAI 2018. LNCS, vol. 11070, pp. 520–528. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00928-1_59

  19. Kirkpatrick, J., et al.: Overcoming catastrophic forgetting in neural networks. Proc. Natl. Acad. Sci. 114(13), 3521–3526 (2017). https://doi.org/10.1073/pnas.1611835114, http://arxiv.org/abs/1612.00796

  20. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791

  21. Lee, S.W., Kim, J.H., Jun, J., Ha, J.W., Zhang, B.T.: Overcoming catastrophic forgetting by incremental moment matching. In: Advances in Neural Information Processing Systems 30 (NIPS 2017), pp. 4652–4662 (2017). http://papers.nips.cc/paper/7051-overcoming-catastrophic-forgetting-by-incremental-moment-matching.pdf

  22. Lee, S.: Toward continual learning for conversational agents. CoRR abs/1712.09943 (2017). http://arxiv.org/abs/1712.09943

  23. Li, Z., Hoiem, D.: Learning without forgetting. IEEE Trans. Pattern Anal. Mach. Intell. 40(12), 2935–2947 (2018). https://doi.org/10.1109/TPAMI.2017.2773081

  24. McCloskey, M., Cohen, N.J.: Catastrophic interference in connectionist networks: the sequential learning problem. Psychol. Learn. Motiv. 24, 109–165 (1989). https://doi.org/10.1016/S0079-7421(08)60536-8. http://www.sciencedirect.com/science/article/pii/S0079742108605368

  25. Parisi, G.I., Kemker, R., Part, J.L., Kanan, C., Wermter, S.: Continual lifelong learning with neural networks: a review. Neural Netw. 113, 54–71 (2019). https://doi.org/10.1016/j.neunet.2019.01.012

  26. Pfülb, B., Gepperth, A.: A comprehensive, application-oriented study of catastrophic forgetting in DNNs. CoRR abs/1905.08101 (2019). http://arxiv.org/abs/1905.08101

  27. Rebuffi, S.-A., Kolesnikov, A., Sperl, G., Lampert, C.H.: iCaRL: incremental classifier and representation learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2001–2010 (2017). https://doi.org/10.1109/CVPR.2017.587

  28. Ren, B., Wang, H., Li, J., Gao, H.: Life-long learning based on dynamic combination model. Appl. Soft Comput. J. 56, 398–404 (2017). https://doi.org/10.1016/j.asoc.2017.03.005

  29. Serrà, J., Surís, D., Miron, M., Karatzoglou, A.: Overcoming catastrophic forgetting with hard attention to the task. arXiv preprint arXiv:1801.01423 (2018)

  30. Shin, H., Lee, J.K., Kim, J., Kim, J.: Continual learning with deep generative replay. In: Advances in Neural Information Processing Systems 30 (NIPS 2017) (2017)

  31. Shmelkov, K., Schmid, C., Alahari, K.: Incremental learning of object detectors without catastrophic forgetting. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3400–3409 (2017). https://doi.org/10.1109/ICCV.2017.368

  32. Srivastava, R.K., Masci, J., Kazerounian, S., Gomez, F., Schmidhuber, J.: Compete to compute. In: Advances in Neural Information Processing Systems 26 (NIPS 2013), pp. 2310–2318 (2013). http://papers.nips.cc/paper/5059-compete-to-compute.pdf

  33. Wang, J., Chen, Y., Hao, S., Peng, X., Hu, L.: Deep learning for sensor-based activity recognition: a survey. Pattern Recogn. Lett. (2018). https://doi.org/10.1016/j.patrec.2018.02.010, http://www.sciencedirect.com/science/article/pii/S016786551830045X

  34. Wu, C., Herranz, L., Liu, X., Wang, Y., van de Weijer, J., Raducanu, B.: Memory replay GANs: learning to generate images from new categories without forgetting. arXiv preprint arXiv:1809.02058 (2018). http://dl.acm.org/citation.cfm?id=3327345.3327496

  35. Xiao, H., Rasul, K., Vollgraf, R.: Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. CoRR abs/1708.07747 (2017). http://arxiv.org/abs/1708.07747

  36. Xing, Z., Pei, J., Keogh, E.: A brief survey on sequence classification. ACM SIGKDD Explor. Newsl. 12(1), 40–48 (2010). https://doi.org/10.1145/1882471.1882478

Author information

Corresponding author

Correspondence to Monika Schak.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Schak, M., Gepperth, A. (2019). A Study on Catastrophic Forgetting in Deep LSTM Networks. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Deep Learning. ICANN 2019. Lecture Notes in Computer Science, vol. 11728. Springer, Cham. https://doi.org/10.1007/978-3-030-30484-3_56

  • DOI: https://doi.org/10.1007/978-3-030-30484-3_56

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30483-6

  • Online ISBN: 978-3-030-30484-3

  • eBook Packages: Computer Science, Computer Science (R0)
