Abstract
Continuously processing a stream of non-i.i.d. data with neural models, with the goal of progressively learning new skills, is widely known to introduce significant challenges, frequently leading to catastrophic forgetting. In this paper we tackle this problem by focusing on the low-level aspects of the neural computation model, differently from the most common existing approaches. We propose a novel neuron model, referred to as Continual Neural Unit (CNU), which not only computes a response to an input pattern, but also diversifies its computations to preserve what was previously learned, while remaining plastic enough to adapt to new knowledge. The values attached to the weights are the outcome of a computational process that depends on the neuron input, implemented by a key-value map which selects and blends multiple sets of learnable memory units. This computational mechanism implements a natural, learnable form of soft parameter isolation, virtually defining multiple computational paths within each neural unit. We show that such a computational scheme is related to those of popular models that perform computations relying on a set of samples stored in a memory buffer, including Kernel Machines and Transformers. Experiments on class- and domain-incremental streams, processed in an online, single-pass manner, show how CNUs can mitigate forgetting without any replay or more informed learning criteria, while achieving competitive or better performance than continual learning methods that explicitly store and replay data.
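As a rough illustration of the mechanism described above, the following PyTorch-style sketch shows a single unit whose weight vector is not a fixed parameter but is blended from a bank of learnable memory units via a key-value match against a projection of the input. All names and design choices (ContinualNeuralUnitSketch, the softmax blending, \(\psi\) as a linear projection) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContinualNeuralUnitSketch(nn.Module):
    """Single output unit whose weights are computed from the input (illustrative sketch)."""

    def __init__(self, in_features: int, num_memories: int = 8, key_dim: int = 16):
        super().__init__()
        # psi: a learnable projection of the input, used to query the memory keys
        self.psi = nn.Linear(in_features, key_dim, bias=False)
        # bank of keys and of candidate weight vectors ("memory units")
        self.keys = nn.Parameter(torch.randn(num_memories, key_dim))
        self.memories = nn.Parameter(torch.randn(num_memories, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features)
        q = self.psi(x)                                # (batch, key_dim)
        scores = F.softmax(q @ self.keys.t(), dim=-1)  # soft selection of memories
        w = scores @ self.memories                     # input-dependent weights, (batch, in_features)
        return (w * x).sum(dim=-1, keepdim=True) + self.bias

# usage: a batch of 4 input patterns with 32 features each -> 4 scalar responses
unit = ContinualNeuralUnitSketch(in_features=32)
y = unit(torch.randn(4, 32))  # shape (4, 1)
```

Since each input softly selects its own mixture of memories, gradient updates mostly affect the memories that were selected, which is one way to realize the learnable soft parameter isolation the abstract refers to.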
M. Tiezzi—Work done while working at DIISM, University of Siena, Italy.
Notes
- 1.
That is what we did in our experiments. Investigating other ways in which \(\psi \) could be defined is beyond the scope of this paper. For example, if x is an image/feature map, \(\psi (x)\) could be a down-scaling operation (see the sketch after these notes).
- 2.
The case in which \(w'x\) is zero must be handled explicitly. As a matter of fact, it is the same point that requires special handling in ReLUs, to deal with the discontinuity in the first derivative.
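A minimal sketch of the two notes above, under assumed design choices that are not taken from the paper: Note 1 with \(\psi \) realized as a down-scaling of an image/feature map, and Note 2 via the ReLU analogy, where the derivative at the non-differentiable point is fixed by convention (PyTorch uses zero there).

```python
import torch
import torch.nn.functional as F

# Note 1: if x is an image/feature map of shape (C, H, W), psi can be a simple
# down-scaling, e.g. adaptive average pooling followed by flattening.
def psi_downscale(x: torch.Tensor, size: int = 4) -> torch.Tensor:
    return F.adaptive_avg_pool2d(x, size).flatten(start_dim=-3)

# Note 2 (analogy): ReLU is not differentiable at 0, and frameworks fix a
# conventional derivative there; the w'x = 0 case of the proposed unit
# similarly requires an explicit convention.
x0 = torch.zeros(1, requires_grad=True)
torch.relu(x0).sum().backward()
print(x0.grad)  # tensor([0.]) -- the convention chosen at the kink
```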
Acknowledgements
This work was supported by the Italian Ministry of Research, under the complementary actions to the NRRP "Fit4MedRob - Fit for Medical Robotics" Grant (# PNC0000007).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Tiezzi, M., Marullo, S., Becattini, F., Melacci, S. (2024). Continual Neural Computation. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol. 14942. Springer, Cham. https://doi.org/10.1007/978-3-031-70344-7_20
DOI: https://doi.org/10.1007/978-3-031-70344-7_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70343-0
Online ISBN: 978-3-031-70344-7
eBook Packages: Computer Science, Computer Science (R0)