Abstract
Continuously processing a stream of non-i.i.d. data with neural models, with the goal of progressively learning new skills, is widely known to introduce significant challenges, frequently leading to catastrophic forgetting. In this paper we tackle this problem by focusing on the low-level aspects of the neural computation model, differently from the most common existing approaches. We propose a novel neuron model, referred to as Continual Neural Unit (CNU), which not only computes a response to an input pattern, but also diversifies its computations to preserve what was previously learned, while remaining plastic enough to adapt to new knowledge. The values attached to the weights are the outcome of a computational process that depends on the neuron input, implemented by a key-value map which selects and blends multiple sets of learnable memory units. This computational mechanism implements a natural, learnable form of soft parameter isolation, virtually defining multiple computational paths within each neural unit. We show that such a computational scheme is related to those of popular models that perform computations relying on a set of samples stored in a memory buffer, including Kernel Machines and Transformers. Experiments on class- and domain-incremental streams, processed in an online, single-pass manner, show how CNUs can mitigate forgetting without any replay or more informed learning criteria, while achieving competitive or better performance than continual learning methods that explicitly store and replay data.
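As a rough illustration of the mechanism described above, the following PyTorch-style sketch shows a single unit whose weight vector is not a fixed parameter but is blended from a bank of learnable memory units via a key-value match against a projection of the input. All names and design choices (ContinualNeuralUnitSketch, the softmax blending, \(\psi\) as a linear projection) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContinualNeuralUnitSketch(nn.Module):
    """Single output unit whose weights are computed from the input (illustrative sketch)."""

    def __init__(self, in_features: int, num_memories: int = 8, key_dim: int = 16):
        super().__init__()
        # psi: a learnable projection of the input, used to query the memory keys
        self.psi = nn.Linear(in_features, key_dim, bias=False)
        # bank of keys and of candidate weight vectors ("memory units")
        self.keys = nn.Parameter(torch.randn(num_memories, key_dim))
        self.memories = nn.Parameter(torch.randn(num_memories, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features)
        q = self.psi(x)                                # (batch, key_dim)
        scores = F.softmax(q @ self.keys.t(), dim=-1)  # soft selection of memories
        w = scores @ self.memories                     # input-dependent weights, (batch, in_features)
        return (w * x).sum(dim=-1, keepdim=True) + self.bias

# usage: a batch of 4 input patterns with 32 features each -> 4 scalar responses
unit = ContinualNeuralUnitSketch(in_features=32)
y = unit(torch.randn(4, 32))  # shape (4, 1)
```

Since each input softly selects its own mixture of memories, gradient updates mostly affect the memories that were selected, which is one way to realize the learnable soft parameter isolation the abstract refers to.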
M. Tiezzi—Work done while working at DIISM, University of Siena, Italy.
Notes
- 1.
That is what we did in our experiments. Investigating other ways in which \(\psi \) could be defined is beyond the scope of this paper. For example, if x is an image/feature map, \(\psi (x)\) could be a down-scaling operation (see the sketch after these notes).
- 2.
The case in which \(w'x\) is zero must be handled explicitly. As a matter of fact, it is the same point that requires special handling in ReLUs, to deal with the discontinuity in the first derivative.
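A minimal sketch of the two notes above, under assumed design choices that are not taken from the paper: Note 1 with \(\psi \) realized as a down-scaling of an image/feature map, and Note 2 via the ReLU analogy, where the derivative at the non-differentiable point is fixed by convention (PyTorch uses zero there).

```python
import torch
import torch.nn.functional as F

# Note 1: if x is an image/feature map of shape (C, H, W), psi can be a simple
# down-scaling, e.g. adaptive average pooling followed by flattening.
def psi_downscale(x: torch.Tensor, size: int = 4) -> torch.Tensor:
    return F.adaptive_avg_pool2d(x, size).flatten(start_dim=-3)

# Note 2 (analogy): ReLU is not differentiable at 0, and frameworks fix a
# conventional derivative there; the w'x = 0 case of the proposed unit
# similarly requires an explicit convention.
x0 = torch.zeros(1, requires_grad=True)
torch.relu(x0).sum().backward()
print(x0.grad)  # tensor([0.]) -- the convention chosen at the kink
```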
Acknowledgements
This work was supported by the Italian Ministry of Research, under the complementary actions to the NRRP "Fit4MedRob - Fit for Medical Robotics" Grant (# PNC0000007).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Tiezzi, M., Marullo, S., Becattini, F., Melacci, S. (2024). Continual Neural Computation. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol. 14942. Springer, Cham. https://doi.org/10.1007/978-3-031-70344-7_20
DOI: https://doi.org/10.1007/978-3-031-70344-7_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70343-0
Online ISBN: 978-3-031-70344-7
eBook Packages: Computer Science, Computer Science (R0)