DOI: 10.1145/3664647.3681164

Incremental Learning via Robust Parameter Posterior Fusion

Published: 28 October 2024

Abstract

The posterior estimation of parameters based on Bayesian theory is a crucial technique in Incremental Learning (IL). The estimated posterior is typically used to impose a loss regularization that aligns the current model parameters with the previously learned posterior, thereby mitigating catastrophic forgetting, a major challenge in IL. However, this additional regularization can also hinder learning and prevent the model from reaching the true global optimum. To overcome this limitation, this paper introduces a novel Bayesian IL framework, Robust Parameter Posterior Fusion (RP2F). Unlike traditional methods, RP2F directly estimates the parameter posterior for new data without introducing extra loss regularization, allowing the model to absorb new knowledge more fully. It then fuses this new posterior with the existing ones under the Maximum A Posteriori (MAP) principle, ensuring effective knowledge sharing across tasks. Furthermore, RP2F incorporates a common parameter-robustness prior to facilitate seamless integration during posterior fusion. Comprehensive experiments on the CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets show that RP2F not only effectively mitigates catastrophic forgetting but also achieves backward knowledge transfer.
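
The fusion step described above can be pictured as combining Gaussian approximations of the per-task parameter posteriors. Below is a minimal Python sketch of precision-weighted MAP fusion for diagonal Gaussian posteriors; the function name, the diagonal-Gaussian (Laplace-style) assumption, and the zero-mean robustness prior are illustrative placeholders under stated assumptions, not the exact RP2F procedure.

    # Hypothetical sketch: MAP fusion of two diagonal Gaussian parameter
    # posteriors with a zero-mean Gaussian "robustness" prior. Illustrative
    # only; not the paper's exact algorithm.
    import numpy as np

    def fuse_posteriors(mu_old, prec_old, mu_new, prec_new, prior_prec=1e-3):
        # The product of the two diagonal Gaussians and the zero-mean prior is
        # again Gaussian; its mode (the MAP estimate) is the precision-weighted
        # average of the means, and its precision is the sum of the precisions.
        prec_fused = prec_old + prec_new + prior_prec
        mu_fused = (prec_old * mu_old + prec_new * mu_new) / prec_fused
        return mu_fused, prec_fused

    # Toy usage on a 5-dimensional parameter vector.
    rng = np.random.default_rng(0)
    mu_old, prec_old = rng.normal(size=5), np.full(5, 10.0)  # old-task posterior (high confidence)
    mu_new, prec_new = rng.normal(size=5), np.full(5, 2.0)   # new-task posterior (lower confidence)
    mu_fused, prec_fused = fuse_posteriors(mu_old, prec_old, mu_new, prec_new)
    print("fused mean:", mu_fused)
    print("fused precision:", prec_fused)

Under this assumption the fused mode leans toward whichever task estimated a parameter with higher precision, which is one way knowledge can be shared across tasks without adding a regularization term to the new-task loss.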



Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
October 2024
11719 pages
ISBN: 9798400706868
DOI: 10.1145/3664647


Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 28 October 2024


Author Tags

  1. Bayesian theory
  2. catastrophic forgetting
  3. incremental learning
  4. lifelong learning

Qualifiers

  • Research-article


Conference

MM '24: The 32nd ACM International Conference on Multimedia
October 28 - November 1, 2024
Melbourne, VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate: 1,150 of 4,385 submissions, 26%
Overall Acceptance Rate: 2,145 of 8,556 submissions, 25%
