DOI: 10.1145/3511808.3557375

Latent Coreset Sampling based Data-Free Continual Learning

Published: 17 October 2022

Abstract

Catastrophic forgetting poses a major challenge in continual learning: old knowledge is forgotten when the model is updated on new tasks. Existing solutions typically address this challenge through generative models or exemplar-replay strategies. However, such methods may still generate or select low-quality samples for replay, which directly reduces the effectiveness of the model, especially under class imbalance, noise, or redundancy. Accordingly, selecting a suitable coreset during continual learning becomes significant in such settings. In this work, we propose a novel approach that leverages continual coreset sampling (CCS) to address these challenges. We aim to select the most representative subset at each iteration, such that when the model is trained on new tasks, the gradient of the subset with respect to the model parameters closely matches that of both the previous and current tasks. In this way, adaptation of the model to new datasets becomes more efficient. Furthermore, rather than storing old data to maintain old knowledge, our approach preserves it in the latent space: we augment the previous classes in the embedding space with pseudo sample vectors drawn from the old encoder's output, strengthened by joint training with the selected new data. This avoids invading data privacy in real-world applications when the model is updated. Our experiments on various CV/NLP datasets validate the effectiveness of the proposed approach against current baselines and show clear improvements in model adaptation and forgetting reduction in a data-free manner.
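
To make the two mechanisms in the abstract concrete, the following is a minimal, illustrative PyTorch sketch of (i) greedy gradient-matching coreset selection and (ii) latent-space pseudo-sample replay, as we read them from the abstract. It is a sketch under assumptions, not the authors' implementation: the toy model, the last-layer gradient proxy, the greedy matching criterion, and all names (`TinyNet`, `select_coreset`, `pseudo_latents`) are hypothetical.

```python
# Minimal illustrative sketch (assumptions, not the paper's actual code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyNet(nn.Module):
    """Toy encoder + classifier head standing in for the paper's model."""

    def __init__(self, in_dim=32, latent_dim=64, n_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, latent_dim), nn.ReLU())
        self.head = nn.Linear(latent_dim, n_classes)


def per_sample_grads(model, x, y):
    """Cheap per-sample gradient proxy: the cross-entropy gradient w.r.t. the
    head weights, dL_i/dW = outer(softmax(z_i) - onehot(y_i), h_i)."""
    h = model.encoder(x)                                   # (B, d) latents
    z = model.head(h)                                      # (B, C) logits
    err = F.softmax(z, dim=1) - F.one_hot(y, z.size(1)).float()
    return torch.einsum("bc,bd->bcd", err, h).flatten(1)   # (B, C*d)


def select_coreset(model, x, y, k):
    """Greedily pick k samples whose mean gradient best matches the
    full-batch mean gradient (an orthogonal-matching-pursuit flavour)."""
    with torch.no_grad():
        g = per_sample_grads(model, x, y)                  # (B, D)
        target = g.mean(dim=0)                             # gradient to match
        residual = target.clone()
        chosen = []
        taken = torch.zeros(g.size(0), dtype=torch.bool)
        for _ in range(k):
            scores = g @ residual                          # fit to residual
            scores[taken] = float("-inf")                  # skip chosen ones
            i = int(scores.argmax())
            chosen.append(i)
            taken[i] = True
            residual = target - g[chosen].mean(dim=0)      # update residual
        return chosen


def pseudo_latents(prototypes, radii, n_per_class):
    """Data-free replay: rather than storing old inputs, sample pseudo latent
    vectors around each old class's stored prototype (mean encoder output)."""
    feats, labels = [], []
    for c, (mu, r) in enumerate(zip(prototypes, radii)):
        feats.append(mu + r * torch.randn(n_per_class, mu.numel()))
        labels.append(torch.full((n_per_class,), c, dtype=torch.long))
    return torch.cat(feats), torch.cat(labels)


if __name__ == "__main__":
    torch.manual_seed(0)
    model = TinyNet()
    x, y = torch.randn(256, 32), torch.randint(0, 10, (256,))
    print("coreset indices:", select_coreset(model, x, y, k=32)[:8], "...")
    # Prototypes/radii would come from the previous encoder's outputs.
    protos, radii = [torch.randn(64) for _ in range(3)], [0.1] * 3
    zf, zy = pseudo_latents(protos, radii, n_per_class=16)
    print("pseudo latent batch:", tuple(zf.shape), tuple(zy.shape))
```

Under these assumptions, the pseudo latent vectors for old classes would be trained jointly (through the classifier head) with the selected new-task coreset, so that no old inputs need to be stored.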

Supplementary Material

MP4 File (CIKM_fp0475_Presentation.mp4)
The virtual presentation of the paper "Latent Coreset Sampling based Data-Free Continual Learning" (ID No: fp0475)




    Published In

    CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management
    October 2022
    5274 pages
    ISBN:9781450392365
    DOI:10.1145/3511808
    • General Chairs:
    • Mohammad Al Hasan,
    • Li Xiong

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 17 October 2022


    Author Tags

    1. continual learning
    2. coreset sampling
    3. data-free
    4. latent space

    Qualifiers

    • Research-article

    Conference

    CIKM '22

    Acceptance Rates

    CIKM '22 Paper Acceptance Rate 621 of 2,257 submissions, 28%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%




    Article Metrics

    • Downloads (last 12 months): 35
    • Downloads (last 6 weeks): 3
    Reflects downloads up to 23 Feb 2025.

    Cited By
    • (2024) ConfliLPC: Logits and Parameter Calibration for Political Conflict Analysis in Continual Learning. 2024 IEEE International Conference on Big Data (BigData), 6320-6329. https://doi.org/10.1109/BigData62323.2024.10826026. Online publication date: 15-Dec-2024.
    • (2024) Knowledge transfer in lifelong machine learning: a systematic literature review. Artificial Intelligence Review 57(8). https://doi.org/10.1007/s10462-024-10853-9. Online publication date: 26-Jul-2024.
    • (2023) Power Norm Based Lifelong Learning for Paraphrase Generations. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2266-2271. https://doi.org/10.1145/3539618.3592039. Online publication date: 19-Jul-2023.
    • (2023) 2MiCo: A Contrastive Semi-Supervised Method with Double Mixup for Smart Meter Modbus RS-485 Communication Security. 2023 IEEE 9th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing (HPSC), and IEEE Intl Conference on Intelligent Data and Security (IDS), 30-39. https://doi.org/10.1109/BigDataSecurity-HPSC-IDS58521.2023.00016. Online publication date: May-2023.
