Abstract
Replay strategies are Continual Learning techniques that mitigate catastrophic forgetting by keeping a buffer of patterns from previous experiences and interleaving them with new data during training. The number of patterns stored in the buffer is a critical parameter that largely influences both the final performance and the memory footprint of the approach. This work introduces Distilled Replay, a novel replay strategy for Continual Learning that mitigates forgetting while keeping a very small buffer (one pattern per class) of highly informative samples. Distilled Replay builds the buffer through a distillation process that compresses a large dataset into a tiny set of informative examples. We show the effectiveness of Distilled Replay against popular replay-based strategies on four Continual Learning benchmarks.
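To make the distillation idea concrete, below is a minimal, self-contained sketch of the kind of bi-level optimisation that dataset distillation performs, written in PyTorch-style Python and assuming a single-layer linear learner for simplicity (the paper's released code at the repository above uses different models and training loops; the names `distill_buffer`, `inner_lr`, and `outer_lr` are illustrative, not the authors' API). The synthetic buffer, one learnable pattern per class, is optimised so that a freshly initialised learner, after a single gradient step on the buffer, fits the real data of the current experience well.

```python
import torch
import torch.nn.functional as F

def distill_buffer(real_x, real_y, n_classes, n_features,
                   outer_steps=200, inner_lr=0.1, outer_lr=0.05):
    """Learn one synthetic pattern per class via bi-level optimisation (sketch)."""
    # Buffer: one learnable example per class with fixed labels 0..n_classes-1.
    syn_x = torch.randn(n_classes, n_features, requires_grad=True)
    syn_y = torch.arange(n_classes)
    buffer_opt = torch.optim.SGD([syn_x], lr=outer_lr)

    for _ in range(outer_steps):
        # Fresh, randomly initialised learner (here a linear classifier).
        w = (0.01 * torch.randn(n_features, n_classes)).requires_grad_(True)
        b = torch.zeros(n_classes, requires_grad=True)

        # Inner step: one SGD update of the learner on the synthetic buffer,
        # kept differentiable w.r.t. the buffer via create_graph=True.
        inner_loss = F.cross_entropy(syn_x @ w + b, syn_y)
        gw, gb = torch.autograd.grad(inner_loss, (w, b), create_graph=True)
        w1, b1 = w - inner_lr * gw, b - inner_lr * gb

        # Outer step: update the buffer so the one-step learner fits the real data.
        outer_loss = F.cross_entropy(real_x @ w1 + b1, real_y)
        buffer_opt.zero_grad()
        outer_loss.backward()
        buffer_opt.step()

    return syn_x.detach(), syn_y

# Usage sketch on random data standing in for one experience:
# real_x, real_y = torch.randn(512, 784), torch.randint(0, 10, (512,))
# buf_x, buf_y = distill_buffer(real_x, real_y, n_classes=10, n_features=784)
```

When later experiences arrive, such a buffer would simply be interleaved with each new mini-batch, as in standard replay, so the memory cost stays fixed at one synthetic pattern per class.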
Notes
- 1.
The code, along with the configuration files needed to reproduce our results, is available at https://github.com/andrearosasco/DistilledReplay.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Rosasco, A., Carta, A., Cossu, A., Lomonaco, V., Bacciu, D. (2022). Distilled Replay: Overcoming Forgetting Through Synthetic Samples. In: Cuzzolin, F., Cannons, K., Lomonaco, V. (eds.) Continual Semi-Supervised Learning. CSSL 2021. Lecture Notes in Computer Science, vol. 13418. Springer, Cham. https://doi.org/10.1007/978-3-031-17587-9_8
Print ISBN: 978-3-031-17586-2
Online ISBN: 978-3-031-17587-9