
Addressing Heterogeneity in Federated Learning with Client Selection via Submodular Optimization

Published: 16 February 2024

Abstract

Federated learning (FL) is a privacy-preserving distributed learning paradigm that differs from traditional distributed learning in two main respects: systems heterogeneity, meaning that participating clients differ significantly in systems performance, including CPU frequency, dataset size, and transmission power; and statistical heterogeneity, meaning that data across clients are non-independent and identically distributed (non-IID). As a result, selecting clients at random significantly reduces the training efficiency of FL. In this article, we propose a client selection mechanism that accounts for both systems and statistical heterogeneity and aims to improve time-to-accuracy performance by trading off the effects of systems performance differences and data distribution differences among clients on training efficiency. We first formulate client selection as a combinatorial optimization problem that jointly optimizes systems and statistical performance. We then generalize it to a submodular maximization problem with a knapsack constraint and propose the Iterative Greedy with Partial Enumeration (IGPE) algorithm to greedily select suitable clients, and we analyze the approximation ratio of IGPE theoretically. Extensive experiments verify that IGPE outperforms the compared algorithms in time-to-accuracy performance across a variety of heterogeneous environments.
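The selection scheme the abstract describes can be illustrated with a minimal sketch of greedy submodular maximization under a knapsack constraint. This is not the paper's IGPE algorithm (which adds partial enumeration over small seed sets and uses the paper's own utility and cost models); the coverage-style utility, the per-client costs, and the round-time budget below are illustrative assumptions only.

```python
# Hedged sketch: cost-benefit greedy for monotone submodular maximization
# under a knapsack (budget) constraint. Illustrative stand-in, not IGPE.

def coverage_utility(selected, client_labels):
    """Stand-in 'statistical utility': number of distinct label classes
    covered by the selected clients (monotone and submodular)."""
    covered = set()
    for c in selected:
        covered |= client_labels[c]
    return len(covered)

def greedy_knapsack_select(clients, client_labels, cost, budget):
    """Repeatedly pick the affordable client with the best marginal gain
    per unit cost until no client improves the objective within budget."""
    selected, spent = [], 0.0
    remaining = set(clients)
    while remaining:
        base = coverage_utility(selected, client_labels)
        best, best_ratio = None, 0.0
        for c in sorted(remaining):          # sorted for determinism
            if spent + cost[c] > budget:     # knapsack constraint
                continue
            gain = coverage_utility(selected + [c], client_labels) - base
            ratio = gain / cost[c]           # cost-benefit rule
            if ratio > best_ratio:
                best, best_ratio = c, ratio
        if best is None:                     # nothing affordable improves f
            break
        selected.append(best)
        spent += cost[best]
        remaining.discard(best)
    return selected

# Toy example: 4 clients with label sets (statistical side) and
# heterogeneous per-round costs (systems side), budget of 2.5.
labels = {0: {0, 1}, 1: {1}, 2: {2, 3}, 3: {0, 1, 2, 3}}
costs = {0: 1.0, 1: 0.5, 2: 1.0, 3: 4.0}
print(greedy_knapsack_select([0, 1, 2, 3], labels, costs, budget=2.5))  # → [0, 2]
```

Note how the budget excludes client 3 despite its high utility, while the cost-benefit ratio avoids client 1, whose data add no new coverage; partial enumeration in IGPE exists precisely to guard the plain greedy against such high-value/high-cost corner cases.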




Published In

ACM Transactions on Sensor Networks, Volume 20, Issue 2
March 2024, 572 pages
EISSN: 1550-4867
DOI: 10.1145/3618080
Editor: Wen Hu

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Publication History

      Published: 16 February 2024
      Online AM: 22 December 2023
      Accepted: 15 December 2023
      Revised: 21 November 2023
      Received: 04 September 2023
      Published in TOSN Volume 20, Issue 2


      Author Tags

      1. Federated learning
      2. client selection
      3. systems heterogeneity
      4. statistical heterogeneity
      5. submodular functions

      Qualifiers

      • Research-article

      Funding Sources

      • National Natural Science Foundation of China
      • Jiangsu Provincial Key Laboratory of Network and Information Security
      • Key Project of Natural Science Research in Jiangsu Provincial Colleges and Universities
      • Key Laboratory of Computer Network and Information Integration of Ministry of Education of China
      • Collaborative Innovation Center of Novel Software Technology and Industrialization, the Fundamental Research Funds for the Central Universities, CCF-Baidu Open Fund
      • Future Network Scientific Research Fund Project


Cited By

• (2025) FairDPFL-SCS: Fair Dynamic Personalized Federated Learning with strategic client selection for improved accuracy and fairness. Information Fusion 115, 102756. DOI: 10.1016/j.inffus.2024.102756
• (2025) Advancing elderly social care dropout prediction with federated learning: client selection and imbalanced data management. Cluster Computing 28, 2. DOI: 10.1007/s10586-024-04850-4
• (2024) Addressing Bias and Fairness Using Fair Federated Learning: A Synthetic Review. Electronics 13, 23, 4664. DOI: 10.3390/electronics13234664
• (2024) Performance Profiling of Federated Learning Across Heterogeneous Mobile Devices. In 2024 IEEE 24th International Conference on Software Quality, Reliability, and Security Companion (QRS-C), 363–372. DOI: 10.1109/QRS-C63300.2024.00053
• (2024) Equitable Client Selection in Federated Learning via Truncated Submodular Maximization. In 2024 IEEE 63rd Conference on Decision and Control (CDC), 5496–5502. DOI: 10.1109/CDC56724.2024.10886563
• (2024) TSFed: A three-stage optimization mechanism for secure and efficient federated learning in industrial IoT networks. Internet of Things 27, 101287. DOI: 10.1016/j.iot.2024.101287
