research-article | Free Access
DOI: 10.1145/3580305.3599293

CriticalFL: A Critical Learning Periods Augmented Client Selection Framework for Efficient Federated Learning

Published: 04 August 2023

ABSTRACT

Federated learning (FL) is a distributed optimization paradigm that learns from data samples distributed across a number of clients. Adaptive client selection that is cognizant of the training progress of clients has become a major approach to improving FL efficiency, but it is not yet well understood. Most existing FL methods, such as FedAvg and its state-of-the-art variants, implicitly assume that all learning phases during the FL training process are equally important. Unfortunately, this assumption has been shown to be invalid by recent findings on critical learning periods (CLP), in which small gradient errors may lead to an irrecoverable deficiency in final test accuracy. In this paper, we develop CriticalFL, a CLP-augmented FL framework, and show that when existing FL methods are adaptively augmented with CLP so that client selection is guided by the discovered critical periods, the resulting performance improves significantly. Experiments on various machine learning models and datasets validate that the proposed CriticalFL framework consistently achieves improved model accuracy while maintaining better communication efficiency compared to state-of-the-art methods, demonstrating a promising and easily adopted approach to tackling the heterogeneity of FL training.
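To make the high-level idea concrete, below is a minimal illustrative sketch of CLP-guided client selection layered on a FedAvg-style loop. Everything here is an assumption for illustration: the CLP detector (a simple threshold on the global update norm), the per-round selection sizes, and the synthetic linear-regression clients are stand-ins, since the abstract does not spell out the paper's actual CLP detection rule or selection schedule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic non-IID linear-regression clients: each client holds data
# (X_i, y_i) drawn around a client-specific feature shift, sharing one
# ground-truth weight vector w_true.
D, N_CLIENTS, N_SAMPLES = 10, 20, 50
w_true = rng.normal(size=D)
clients = []
for _ in range(N_CLIENTS):
    X = rng.normal(loc=rng.normal(), size=(N_SAMPLES, D))
    y = X @ w_true + 0.1 * rng.normal(size=N_SAMPLES)
    clients.append((X, y))

def local_update(w, X, y, lr=0.005, steps=5):
    """A few local gradient steps on the client's mean squared loss."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

w = np.zeros(D)
prev_update_norm = None
for rnd in range(100):
    # Hypothetical CLP proxy: treat rounds where the global model is
    # still moving fast as "critical" (the paper's actual criterion is
    # not given in the abstract).
    in_clp = prev_update_norm is None or prev_update_norm > 0.05

    # CLP-guided selection: sample more clients during critical periods,
    # fewer afterwards to save communication.
    n_sel = 10 if in_clp else 4
    selected = rng.choice(N_CLIENTS, size=n_sel, replace=False)

    # FedAvg aggregation over the selected clients' local models.
    local_ws = [local_update(w, *clients[i]) for i in selected]
    new_w = np.mean(local_ws, axis=0)

    prev_update_norm = float(np.linalg.norm(new_w - w))
    w = new_w

print("final parameter error:", np.linalg.norm(w - w_true))
```

The intent of the sketch is only the control flow: spend the communication budget during the detected critical period and throttle client participation once the critical phase has passed.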

Supplemental Material

rtfp0430-2min-promo.mp4 (mp4, 3.2 MB)


Published in

KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2023, 5996 pages
ISBN: 9798400701030
DOI: 10.1145/3580305
Copyright © 2023 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions, 13%
