Abstract
Knowledge transfer is an emerging challenge in continual learning: a model must not only overcome catastrophic forgetting but also make full use of knowledge from multiple tasks when solving a particular task. In this paper, we propose to disentangle representations in continual learning into task-shared and task-specific components, using a shared encoder and per-task encoders to obtain the corresponding disentangled representations. However, forgetting persists in the shared encoder, and the task encoders cannot transfer knowledge to one another. To overcome forgetting in the shared encoder, we introduce a Fisher mask that limits updates to parameters important for old tasks. We further propose a multi-knowledge distillation method that lets the shared encoder consolidate the knowledge of all tasks by interacting with the task encoders. To facilitate knowledge transfer among task encoders, we select and fuse knowledge from each old task's encoder and apply it to the new task. Experiments show that our method outperforms state-of-the-art baselines while using a smaller network and no replay data.
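The Fisher-mask mechanism can be illustrated with a short PyTorch sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the diagonal-Fisher estimate, the `keep_ratio` threshold, and the function names `fisher_diagonal` and `apply_fisher_mask` are introduced here for exposition only.

```python
# Minimal sketch (assumed, not the authors' code): estimate a diagonal Fisher
# for the shared encoder on an old task, then mask the gradients of the most
# important parameters so new-task updates leave them untouched.
import torch

def fisher_diagonal(model, data_loader, loss_fn):
    """Approximate the diagonal Fisher information via squared gradients."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    model.eval()
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / max(len(data_loader), 1) for n, f in fisher.items()}

def apply_fisher_mask(model, fisher, keep_ratio=0.5):
    """Zero the gradients of the top `keep_ratio` most important parameters,
    so only the remaining shared-encoder parameters are updated."""
    for n, p in model.named_parameters():
        if p.grad is None or p.numel() < 2:
            continue
        threshold = torch.quantile(fisher[n].flatten(), 1.0 - keep_ratio)
        p.grad[fisher[n] >= threshold] = 0.0
```

In this sketch, `apply_fisher_mask` would be called between `loss.backward()` and `optimizer.step()` on each new-task batch, so that parameters the Fisher estimate marks as important for old tasks receive no update.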
Z. Xu and Q. Qin—Equal contribution.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Xu, Z., Qin, Q., Liu, B., Zhao, D. (2024). Disentangled Representations for Continual Learning: Overcoming Forgetting and Facilitating Knowledge Transfer. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol 14944. Springer, Cham. https://doi.org/10.1007/978-3-031-70359-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70358-4
Online ISBN: 978-3-031-70359-1