DOI: 10.1145/3511808.3557375

Latent Coreset Sampling based Data-Free Continual Learning

Published: 17 October 2022

ABSTRACT

Catastrophic forgetting is a major challenge in continual learning: old knowledge is forgotten when the model is updated on new tasks. Existing solutions typically address this challenge with generative models or exemplar-replay strategies. However, such methods may still generate or select low-quality samples for replay, which directly reduces the effectiveness of the model, especially under class imbalance, noise, or redundancy. How to select a suitable coreset during continual learning therefore becomes significant in such settings. In this work, we propose a novel approach that leverages continual coreset sampling (CCS) to address these challenges. We aim to select the most representative subset at each iteration so that, when the model is trained on new tasks, the gradient of the selected subset closely approximates/matches the gradient of both the previous and current tasks with respect to the model parameters. In this way, adapting the model to new datasets becomes more efficient. Furthermore, instead of storing old data to maintain old knowledge, our approach preserves it in the latent space: we augment the previous classes in the embedding space as pseudo sample vectors drawn from the old encoder's output, strengthened by joint training with the selected new data. This avoids data privacy invasions in real-world applications when the model is updated. Our experiments validate the effectiveness of the proposed approach against current baselines on various CV/NLP datasets and show clear improvements in model adaptation and forgetting reduction in a data-free manner.
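
To make the two mechanisms described above more concrete, here is a minimal, hypothetical sketch in PyTorch of (a) gradient-matching coreset selection and (b) latent pseudo-sample augmentation. The function names (flat_grad, select_coreset, augment_latents), the greedy cosine-similarity heuristic, and the Gaussian jitter around stored embeddings are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

# NOTE: illustrative sketch only; not the paper's code.

def flat_grad(loss, params):
    """Gradient of `loss` w.r.t. `params`, flattened into a single vector."""
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

def select_coreset(model, xs, ys, budget):
    """Greedily pick `budget` samples whose accumulated gradient best matches
    (in cosine similarity) the gradient of the full batch, mimicking the
    gradient-matching criterion described in the abstract."""
    params = [p for p in model.parameters() if p.requires_grad]
    target = flat_grad(F.cross_entropy(model(xs), ys), params)

    # Per-sample gradients (assumes budget <= batch size).
    per_sample = [
        flat_grad(F.cross_entropy(model(xs[i:i + 1]), ys[i:i + 1]), params)
        for i in range(xs.shape[0])
    ]

    chosen, acc = [], torch.zeros_like(target)
    for _ in range(budget):
        best, best_sim = None, float("-inf")
        for i, g in enumerate(per_sample):
            if i in chosen:
                continue
            sim = F.cosine_similarity(acc + g, target, dim=0)
            if sim > best_sim:
                best, best_sim = i, sim
        chosen.append(best)
        acc = acc + per_sample[best]
    return chosen

def augment_latents(old_embeddings, n, scale=0.1):
    """Draw pseudo sample vectors around embeddings stored from the old
    encoder (Gaussian jitter), so raw old data never needs to be kept."""
    idx = torch.randint(0, old_embeddings.shape[0], (n,))
    return old_embeddings[idx] + scale * torch.randn_like(old_embeddings[idx])

if __name__ == "__main__":
    torch.manual_seed(0)
    model = torch.nn.Linear(16, 4)                  # toy classifier
    xs, ys = torch.randn(32, 16), torch.randint(0, 4, (32,))
    print(select_coreset(model, xs, ys, budget=5))  # indices of the selected coreset
    print(augment_latents(torch.randn(10, 16), n=3).shape)

In a full pipeline, the selected samples would be trained jointly with the pseudo latent vectors when moving to the next task; an actual implementation would also need per-class embedding statistics and a replay or distillation loss, which this sketch omits.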


Supplemental Material

CIKM_fp0475_Presentation.mp4 (mp4, 246.1 MB)


• Published in

  CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management
  October 2022
  5274 pages
  ISBN: 9781450392365
  DOI: 10.1145/3511808
  • General Chairs: Mohammad Al Hasan, Li Xiong

      Copyright © 2022 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 17 October 2022


      Qualifiers

      • research-article

      Acceptance Rates

CIKM '22 Paper Acceptance Rate: 621 of 2,257 submissions, 28%; Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
