Abstract
Important recent advances in the domain of incremental or continual learning with DNNs, such as Elastic Weight Consolidation (EWC) or Incremental Moment Matching (IMM), rely on a quantity termed the Fisher information matrix (FIM). While the results obtained in this way are very promising, the use of the FIM relies on the assumptions that (a) the FIM can be approximated by its diagonal, and (b) the FIM diagonal entries are related to the variance of a DNN parameter in the context of Bayesian neural networks. In addition, the FIM is notoriously difficult to compute in automatic differentiation (AD) frameworks like TensorFlow, and existing implementations require an excessive amount of memory for this reason. We present the Matrix of SQuares (MaSQ), computed similarly to the FIM, but whose use in EWC-like algorithms follows directly from the calculus of derivatives and requires no additional assumptions. Additionally, MaSQ computation in AD frameworks is much simpler and more memory-efficient than FIM computation. Using MaSQ together with EWC, we show performance superior or equal to FIM/EWC on a variety of benchmark tasks.
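To make the abstract's claim concrete, the following is a minimal sketch, not the authors' reference implementation, of how a MaSQ-style importance estimate could be accumulated in TensorFlow 2. It assumes MaSQ entries are per-parameter averages of squared loss gradients over the training set (consistent with the abstract's description of a "matrix of squares" computed similarly to the FIM diagonal), processed one example at a time. The names masq_diagonal, ewc_penalty, and the strength parameter lam are illustrative choices, not taken from the paper.

import tensorflow as tf

def masq_diagonal(model, loss_fn, dataset):
    # Accumulate per-parameter squared gradients of the task loss.
    # Unlike the FIM, this requires no sampling from the model's
    # predictive distribution: ordinary backpropagated gradients are
    # squared and averaged. Assumes `dataset` yields single examples
    # (batch size 1) so that per-sample squares are accumulated.
    sums = [tf.zeros_like(v) for v in model.trainable_variables]
    steps = 0
    for x, y in dataset:
        with tf.GradientTape() as tape:
            loss = loss_fn(y, model(x, training=False))
        grads = tape.gradient(loss, model.trainable_variables)
        # Guard against variables unused by the loss (gradient is None).
        sums = [s if g is None else s + tf.square(g)
                for s, g in zip(sums, grads)]
        steps += 1
    return [s / steps for s in sums]

def ewc_penalty(model, anchors, importances, lam=1.0):
    # EWC-style quadratic penalty: parameters with large accumulated
    # squared gradients on the previous task are pulled toward their
    # anchored (post-task) values.
    return 0.5 * lam * tf.add_n([
        tf.reduce_sum(w * tf.square(v - a))
        for v, a, w in zip(model.trainable_variables, anchors, importances)
    ])

In this sketch, after training on task A one would snapshot anchors = [tf.identity(v) for v in model.trainable_variables] and importances = masq_diagonal(model, loss_fn, task_a_data), then add ewc_penalty(...) to the loss while training on task B. The memory advantage the abstract alludes to is plausible under these assumptions: only one extra accumulator tensor per parameter is kept, updated in place, with no per-class or per-sample gradient storage.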
Cite this paper
Gepperth, A., Wiech, F. (2019). Simplified Computation and Interpretation of Fisher Matrices in Incremental Learning with Deep Neural Networks. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds.) Artificial Neural Networks and Machine Learning – ICANN 2019: Deep Learning. Lecture Notes in Computer Science, vol. 11728. Springer, Cham. https://doi.org/10.1007/978-3-030-30484-3_39