FedINC: An Exemplar-Free Continual Federated Learning Framework with Small Labeled Data

ABSTRACT
Federated learning (FL) has shown great promise for privacy-preserving learning by enabling collaborative training across decentralized clients. In realistic FL scenarios, however, clients often collect new data continuously and may join or leave training dynamically. As a result, the global model tends to forget old knowledge while acquiring new knowledge. Meanwhile, labeling the continuously arriving data in real time is usually impractical, so the catastrophic forgetting problem intertwined with the label deficiency issue poses significant challenges for both learning new knowledge and consolidating old knowledge. To address these challenges, we develop FedINC, a novel exemplar-free continual federated learning framework that learns a global incremental model from limited labeled data. We first identify the causes of catastrophic forgetting through in-depth empirical studies. Guided by these findings, we equip FedINC with targeted mechanisms: a hybrid contrastive learning mechanism to efficiently learn new knowledge from limited labeled data, a plastic feature regularization mechanism to preserve the representation space of old tasks, a prototype-guided regularization mechanism to mitigate feature overlap between old and new classes while aligning the features of non-IID clients, and a prototype evolution mechanism for flexible and efficient incremental classification. Extensive experiments demonstrate the superiority of FedINC in both the convergence speed and the accuracy of the global model.
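The abstract does not specify how the hybrid contrastive learning mechanism is formulated; purely as an illustrative sketch, the supervised branch of such a mechanism could follow the supervised contrastive (SupCon) loss of Khosla et al. (2020), in which features sharing a label are pulled together and all other samples in the batch are pushed apart. The function name and NumPy formulation below are our own assumptions, not FedINC's actual objective.

```python
import numpy as np

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """SupCon-style loss sketch (hypothetical; not FedINC's exact objective)."""
    # L2-normalize so the dot product equals cosine similarity.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T / temperature
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    # Exclude each anchor from its own softmax denominator.
    logits = np.where(self_mask, -np.inf, sim)
    # Row-wise log-softmax over all other samples in the batch.
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives: same-label pairs, excluding the anchor itself.
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    # Average negative log-probability of positives, per anchor.
    per_anchor = (-np.where(pos_mask, log_prob, 0.0).sum(axis=1)
                  / np.maximum(pos_mask.sum(axis=1), 1))
    return per_anchor.mean()
```

With limited labels, a hybrid scheme would presumably combine this supervised term with an unsupervised contrastive term (e.g., SimCLR-style augmented views) on the unlabeled data; the weighting between the two is not stated in the abstract.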
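Similarly, the prototype evolution mechanism is only named in the abstract. A minimal sketch of the general idea, under our own assumptions, is a nearest-prototype classifier that keeps one mean feature vector per class and evolves it with an exponential moving average as new features arrive, so new classes can be added without retraining a classifier head. The class and parameter names below are hypothetical.

```python
import numpy as np

class PrototypeClassifier:
    """Nearest-prototype classifier with EMA-style prototype evolution (sketch)."""

    def __init__(self, momentum=0.9):
        self.momentum = momentum
        self.prototypes = {}  # class label -> mean feature vector

    def update(self, features, labels):
        # Evolve each class prototype toward the mean of its new features.
        for c in np.unique(labels):
            mean_feat = features[labels == c].mean(axis=0)
            if c in self.prototypes:
                self.prototypes[c] = (self.momentum * self.prototypes[c]
                                      + (1 - self.momentum) * mean_feat)
            else:
                # New class encountered: initialize its prototype directly,
                # enabling incremental classification without a fixed head.
                self.prototypes[c] = mean_feat

    def predict(self, features):
        # Classify by cosine similarity to the nearest class prototype.
        classes = sorted(self.prototypes)
        protos = np.stack([self.prototypes[c] for c in classes])
        protos = protos / np.linalg.norm(protos, axis=1, keepdims=True)
        feats = features / np.linalg.norm(features, axis=1, keepdims=True)
        sims = feats @ protos.T
        return np.array([classes[i] for i in sims.argmax(axis=1)])
```

In a federated setting, such prototypes could also serve the prototype-guided regularization the abstract mentions (e.g., penalizing feature overlap between old- and new-class prototypes), though the paper's concrete formulation is not given here.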