Abstract
Federated learning (FL) enables multi-client collaboration while keeping local data isolated. In practice, clients with non-IID data frequently join or leave the training process asynchronously, giving rise to the dynamic federated learning (DFL) scenario, which is attracting increasing attention. An effective DFL solution must address two challenges: 1) Statistical Dynamics. The local data distributions of clients are non-IID, and the global data distribution shifts as clients join or leave. 2) Expiration Dynamics. After clients leave the federated training process, their historical model updates retain some validity for reuse in subsequent rounds, but this validity is hard to quantify. In this paper, we first cluster clients with similar data distributions so that the data within each cluster is closer to IID, and we concentrate training on the model of each cluster. We then analyze the changing trend of model validity, termed model quality, and define a suitable function to describe expiration dynamics. As a solution, we propose the Dynamic Clustering Federated Learning (DCFL) framework to improve federated learning on non-IID data in DFL. Specifically, DCFL follows the client-server architecture of standard FL. On the client side, local devices compute summary information about the local data distribution for client clustering. On the server side, we design two strategies for the challenges above. We propose a dynamic clustering aggregation strategy (comprising a dynamic clustering algorithm and a two-stage aggregation) that dynamically clusters clients and then aggregates their local models to overcome Statistical Dynamics. Besides, we propose an expiration memory strategy that reuses historical models and weights them by their model quality during aggregation to overcome Expiration Dynamics.
Finally, we conduct extensive experiments on public datasets, which demonstrate the effectiveness of the DCFL framework.
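The two server-side strategies described in the abstract can be illustrated with a toy sketch. Note the assumptions: the paper's actual clustering metric, model-quality function, and decay rate are not given here, so the L1 distance over label histograms, the greedy clustering, and the exponential decay below are illustrative stand-ins, not the authors' method.

```python
import math

def cluster_clients(histograms, threshold=0.5):
    """Greedily cluster clients by L1 distance between their normalized
    label histograms, so each cluster's pooled data is closer to IID."""
    clusters = []   # each entry: list of client indices
    centroids = []  # running mean histogram per cluster
    for i, h in enumerate(histograms):
        placed = False
        for k, c in enumerate(centroids):
            if sum(abs(a - b) for a, b in zip(h, c)) < threshold:
                clusters[k].append(i)
                n = len(clusters[k])  # update centroid as running mean
                centroids[k] = [(a * (n - 1) + b) / n for a, b in zip(c, h)]
                placed = True
                break
        if not placed:
            clusters.append([i])
            centroids.append(list(h))
    return clusters

def expiration_weight(staleness, lam=0.5):
    """Hypothetical model-quality decay: a historical model's weight fades
    with the number of rounds since its owner left."""
    return math.exp(-lam * staleness)

def two_stage_aggregate(models, sizes, clusters, staleness):
    """Stage 1: weighted average inside each cluster (sample count times
    expiration weight). Stage 2: average cluster models into a global model."""
    cluster_models = []
    for members in clusters:
        w = [sizes[i] * expiration_weight(staleness[i]) for i in members]
        total = sum(w)
        dim = len(models[members[0]])
        agg = [sum(models[i][d] * wi for i, wi in zip(members, w)) / total
               for d in range(dim)]
        cluster_models.append(agg)
    dim = len(cluster_models[0])
    return [sum(cm[d] for cm in cluster_models) / len(cluster_models)
            for d in range(dim)]
```

For example, clients with histograms `[0.9, 0.1]` and `[0.85, 0.15]` land in one cluster while `[0.1, 0.9]` forms its own, after which each cluster's models are aggregated before the global averaging step.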
Acknowledgements
This research was partially supported by grants from the National Key Research and Development Program of China (No. 2021YFF0901003), the National Natural Science Foundation of China (Grants No. 61922073 and U20A20229), the Fundamental Research Funds for the Central Universities (Grant No. WK2150110021), and the iFLYTEK joint research program.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, M., Wu, J., Yin, Y., Huang, Z., Liu, Q., Chen, E. (2022). Dynamic Clustering Federated Learning for Non-IID Data. In: Fang, L., Povey, D., Zhai, G., Mei, T., Wang, R. (eds) Artificial Intelligence. CICAI 2022. Lecture Notes in Computer Science, vol 13606. Springer, Cham. https://doi.org/10.1007/978-3-031-20503-3_10
DOI: https://doi.org/10.1007/978-3-031-20503-3_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20502-6
Online ISBN: 978-3-031-20503-3
eBook Packages: Computer Science (R0)