Abstract
Neural networks suffer from severe catastrophic forgetting when information is learned sequentially. Simply replaying all previous data alleviates the problem, but storing every past training example may require prohibitively large memory, and even with sufficient memory, joint training is infeasible when access to past data is restricted. We developed generative methods for preventing catastrophic forgetting that do not require access to previously used data. The methods are based on activation maximization of output neurons and on sampling from the posterior probability of the data distribution, and they work with regular feedforward networks. Proof-of-concept experiments were performed on publicly available datasets.
This work was supported by the Russian Foundation for Basic Research and the government of Ulyanovsk region (Grant No. 18-47-732006).
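As an illustration of the first idea, here is a minimal sketch of pseudo-sample generation by activation maximization, assuming a PyTorch classifier (model) already trained on earlier tasks; the function name generate_by_activation_maximization and the hyperparameters n_steps and lr are illustrative choices, not the paper's.

```python
# Hypothetical sketch, not the authors' implementation: synthesize an
# input that strongly activates one output neuron of a trained
# PyTorch classifier `model` (assumed to exist).
import torch

def generate_by_activation_maximization(model, class_idx, input_shape,
                                        n_steps=200, lr=0.1):
    """Optimize a noise input so that one output neuron fires strongly."""
    model.eval()
    x = torch.randn(1, *input_shape, requires_grad=True)  # start from noise
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(n_steps):
        opt.zero_grad()
        loss = -model(x)[0, class_idx]  # maximize the target class logit
        loss.backward()
        opt.step()
    return x.detach()
```

The second idea, sampling from the posterior of the data distribution, can similarly be sketched as stochastic gradient Langevin dynamics in input space: noisy gradient ascent on the class log-probability. Again, this is an assumption-laden sketch rather than the authors' exact procedure, with step_size and n_steps as illustrative parameters.

```python
def generate_by_langevin_dynamics(model, class_idx, input_shape,
                                  n_steps=500, step_size=0.01):
    """Draw an approximate sample via noisy gradient ascent (SGLD-style)."""
    x = torch.randn(1, *input_shape)
    for _ in range(n_steps):
        x.requires_grad_(True)
        log_p = torch.log_softmax(model(x), dim=1)[0, class_idx]
        grad = torch.autograd.grad(log_p, x)[0]
        # Langevin update: drift along the gradient plus Gaussian noise
        x = (x + 0.5 * step_size * grad
             + step_size ** 0.5 * torch.randn_like(x)).detach()
    return x
```

In a replay setting, pseudo-samples from either generator would be interleaved with the new task's data during continued training, as in other generative-replay schemes.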
Cite this paper
Leontev, M., Mikheev, A., Sviatov, K., Sukhov, S. (2019). Overcoming Catastrophic Interference with Bayesian Learning and Stochastic Langevin Dynamics. In: Lu, H., Tang, H., Wang, Z. (eds.) Advances in Neural Networks – ISNN 2019. Lecture Notes in Computer Science, vol. 11554. Springer, Cham. https://doi.org/10.1007/978-3-030-22796-8_39