Abstract
Meta-learning approaches have recently achieved promising performance in multi-class incremental learning. However, meta-learners still suffer from catastrophic forgetting: they tend to forget knowledge learned from old tasks when they focus on rapidly adapting to the new classes of the current task. To solve this problem, we propose a novel distilled meta-learning (DML) framework for multi-class incremental learning that seamlessly integrates meta-learning with knowledge distillation at each incremental stage. Specifically, during inner-loop training, knowledge distillation is incorporated into DML to overcome catastrophic forgetting. During outer-loop training, a meta-update rule is designed for the meta-learner to learn across tasks and quickly adapt to new tasks. By virtue of this bilevel optimization, our model is encouraged to strike a balance between retaining old knowledge and learning new knowledge. Experimental results on four benchmark datasets demonstrate the effectiveness of our proposal and show that our method significantly outperforms other state-of-the-art incremental learning methods.
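The bilevel structure described above can be illustrated with a minimal numpy sketch. This is not the paper's implementation: it assumes a linear classifier, a Hinton-style soft-target distillation loss against a frozen old model's logits (`teacher_logits`, hypothetical data here), and a Reptile-style outer update moving the meta-parameters toward the inner-loop result. All names and hyperparameters (`lam`, `eta`, temperature `T`) are illustrative choices, not values from the paper.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; T > 1 softens the distribution for distillation.
    z = z / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def inner_loop(theta, X, y_onehot, teacher_logits, lr=0.1, steps=5, lam=0.5, T=2.0):
    # Inner loop: adapt a copy of the meta-parameters to the new task while a
    # distillation term pulls the softened predictions toward the old model's.
    phi = theta.copy()
    n = len(X)
    p_old = softmax(teacher_logits, T)  # frozen teacher's softened targets
    for _ in range(steps):
        logits = X @ phi
        q_new = softmax(logits)      # standard softmax for the new-task CE loss
        q_soft = softmax(logits, T)  # softened softmax for the distillation loss
        # Gradient of [CE(new labels) + lam * soft-target CE(teacher)] w.r.t. phi
        # (both terms have the usual softmax cross-entropy gradient form).
        grad = X.T @ ((q_new - y_onehot) + lam * (q_soft - p_old)) / n
        phi -= lr * grad
    return phi

def meta_update(theta, phi, eta=0.5):
    # Outer loop: Reptile-style meta-update toward the task-adapted parameters.
    return theta + eta * (phi - theta)

# Toy incremental stage: synthetic new-task data plus a hypothetical old model.
rng = np.random.default_rng(0)
d, c, n = 8, 4, 64
theta = np.zeros((d, c))
X = rng.normal(size=(n, d))
y = rng.integers(0, c, size=n)
y_onehot = np.eye(c)[y]
teacher_logits = rng.normal(size=(n, c))  # stand-in for the old model's outputs

phi = inner_loop(theta, X, y_onehot, teacher_logits)
theta = meta_update(theta, phi)
```

The design point the sketch makes concrete: the distillation term acts only inside the inner loop (restraining forgetting during adaptation), while the outer update consolidates what was learned across tasks into the meta-parameters.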
Index Terms
- Distilled Meta-learning for Multi-Class Incremental Learning