Abstract
Knowledge transfer is an emerging challenge in continual learning: a model must not only overcome catastrophic forgetting but also make full use of knowledge from multiple tasks when solving a particular task. In this paper, we propose to disentangle representations in continual learning into task-shared and task-specific components, using a shared encoder and per-task encoders to obtain the corresponding disentangled representations. However, forgetting persists in the shared encoder, and the task encoders cannot transfer knowledge to one another. To overcome forgetting in the shared encoder, we introduce a Fisher mask that limits updates to parameters important for old tasks. We further propose a multi-knowledge distillation method that lets the shared encoder consolidate the knowledge of all tasks by interacting with the task encoders. To facilitate knowledge transfer among task encoders, we select and fuse knowledge from each old task's encoder and apply it to the new task. Experiments show that our method outperforms state-of-the-art baselines while using a smaller network and no replay data.
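The Fisher-mask mechanism can be illustrated with a short PyTorch sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the diagonal-Fisher estimate, the `keep_ratio` threshold, and the function names `fisher_diagonal` and `apply_fisher_mask` are introduced here for exposition only.

```python
# Minimal sketch (assumed, not the authors' code): estimate a diagonal Fisher
# for the shared encoder on an old task, then mask the gradients of the most
# important parameters so new-task updates leave them untouched.
import torch

def fisher_diagonal(model, data_loader, loss_fn):
    """Approximate the diagonal Fisher information via squared gradients."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    model.eval()
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / max(len(data_loader), 1) for n, f in fisher.items()}

def apply_fisher_mask(model, fisher, keep_ratio=0.5):
    """Zero the gradients of the top `keep_ratio` most important parameters,
    so only the remaining shared-encoder parameters are updated."""
    for n, p in model.named_parameters():
        if p.grad is None or p.numel() < 2:
            continue
        threshold = torch.quantile(fisher[n].flatten(), 1.0 - keep_ratio)
        p.grad[fisher[n] >= threshold] = 0.0
```

In this sketch, `apply_fisher_mask` would be called between `loss.backward()` and `optimizer.step()` on each new-task batch, so that parameters the Fisher estimate marks as important for old tasks receive no update.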
Z. Xu and Q. Qin—Equal contribution.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Xu, Z., Qin, Q., Liu, B., Zhao, D. (2024). Disentangled Representations for Continual Learning: Overcoming Forgetting and Facilitating Knowledge Transfer. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol 14944. Springer, Cham. https://doi.org/10.1007/978-3-031-70359-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70358-4
Online ISBN: 978-3-031-70359-1