Abstract
Due to limited client-side computing capability and communication budgets, most existing Federated Learning (FL) algorithms cannot involve all clients in training; in practice, a subset of clients is selected at random to participate in each round. However, the datasets held by clients are often non-independent and identically distributed (Non-IID), so random selection can cause the global model to be trained on highly imbalanced data, ultimately degrading its performance. To mitigate the impact of dataset imbalance on FL, we propose FedCCBS (Federated Client Class Balanced Sampling), a class-balanced sampling method that groups clients by the number of classes they hold. FedCCBS selects clients with complementary datasets for training, thereby alleviating the adverse effects of imbalanced data on the global model. Experiments on the MNIST, Fashion-MNIST, and CIFAR-10 datasets show that FedCCBS converges faster and with a more stable convergence process, and that its classification accuracy surpasses that of the baseline algorithms.
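The abstract describes selecting clients whose datasets complement one another so the aggregate training data is closer to class-balanced. The sketch below is an illustrative greedy sampler under that idea, not the FedCCBS algorithm itself (whose class-count grouping details are not given in this excerpt): at each step it adds the client that brings the combined label histogram closest to uniform.

```python
import numpy as np

def greedy_class_balanced_sample(label_counts, k):
    """Greedily pick k clients whose combined label histogram is closest
    to uniform. Illustrative only: a stand-in for complementary client
    selection, not the FedCCBS grouping procedure.

    label_counts: (num_clients, num_classes) array of per-client label counts.
    """
    n, c = label_counts.shape
    selected = []
    total = np.zeros(c)
    for _ in range(k):
        best, best_dev = None, None
        for i in range(n):
            if i in selected:
                continue
            cand = total + label_counts[i]
            p = cand / cand.sum()
            dev = np.abs(p - 1.0 / c).sum()  # L1 distance from the uniform distribution
            if best_dev is None or dev < best_dev:
                best, best_dev = i, dev
        selected.append(best)
        total += label_counts[best]
    return selected
```

For example, among four clients holding label counts [10, 0], [0, 10], [10, 0], and [9, 1] over two classes, picking two clients greedily yields a pair whose combined data is nearly balanced, whereas uniform random selection would often pick two clients with the same dominant class.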
Supported by National Natural Science Foundation of China (No.62366052), Natural Science Foundation of Xinjiang Uygur Autonomous Region (No.2022D01C429, 2022D01C427), Key R&D Program of Xinjiang Uygur Autonomous Region (No.2022B01046), College Scientific Research Project Plan of Xinjiang Autonomous Region (No. XJEDU2020Y003).
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Zhang, L., Lin, C., Bie, Z., Li, S., Bi, X., Zhao, K. (2025). Client Selection Mechanism for Federated Learning Based on Class Imbalance. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15031. Springer, Singapore. https://doi.org/10.1007/978-981-97-8487-5_19
Print ISBN: 978-981-97-8486-8
Online ISBN: 978-981-97-8487-5