Abstract
Federated Learning (FL) has emerged as a privacy-preserving method for training machine learning models in a distributed manner on edge devices. However, on-device models face inherent computational and memory limitations, which can constrain gradient updates. As model size increases, the frequency of gradient updates on edge devices decreases, leading to suboptimal training within any given FL round. This limits the feasibility of deploying advanced, large-scale models on edge devices and hinders potential performance gains. To address this issue, we propose FedRepOpt, a gradient re-parameterized optimizer for FL. Gradient re-parameterization allows a simple local model to be trained to a performance similar to that of a complex model by modifying the optimizer's gradients according to a set of model-specific hyperparameters obtained from the complex model. In this work, we focus on VGG-style and Ghost-style models in the FL environment. Extensive experiments demonstrate that models using FedRepOpt achieve significant performance gains of \(16.7\%\) and \(11.4\%\) over the RepGhost-style and RepVGG-style networks, respectively, while also converging \(11.7\%\) and \(57.4\%\) faster than their complex counterparts. Code is available at https://github.com/StevenLauHKHK/FedRepOpt.
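To make the mechanism concrete, below is a minimal PyTorch sketch of a gradient re-parameterized SGD step. It assumes a hypothetical `grad_masks` mapping from parameters to fixed multiplier tensors derived offline from the complex model's branch scales; the class name and arguments are illustrative, not the paper's actual implementation (see the linked repository for that).

```python
import torch

class RepOptSGD(torch.optim.SGD):
    """Sketch of a gradient re-parameterized SGD optimizer.

    `grad_masks` maps each plain-model parameter to a fixed
    multiplier tensor (hypothetical; derived offline from the
    branch scales of the complex, e.g. RepVGG-style, model).
    """

    def __init__(self, params, grad_masks, lr=0.1, momentum=0.9):
        super().__init__(list(params), lr=lr, momentum=momentum)
        self.grad_masks = grad_masks  # {parameter: multiplier tensor}

    @torch.no_grad()
    def step(self, closure=None):
        # Rescale gradients before the ordinary SGD update so that the
        # simple local model follows an update trajectory resembling the
        # one the multi-branch model would have produced.
        for group in self.param_groups:
            for p in group["params"]:
                mask = self.grad_masks.get(p)
                if p.grad is not None and mask is not None:
                    p.grad.mul_(mask)
        return super().step(closure)
```

Because the re-parameterization lives entirely in the optimizer, each FL client trains and transmits only the plain single-branch model; only the small set of shared hyperparameters (the masks) encodes the complex structure.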
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Lau, K.W., Rehman, Y.A.U., Porto Buarque de Gusmão, P., Po, L.M., Ma, L., Xie, Y. (2025). FedRepOpt: Gradient Re-parametrized Optimizers in Federated Learning. In: Cho, M., Laptev, I., Tran, D., Yao, A., Zha, H. (eds) Computer Vision – ACCV 2024. ACCV 2024. Lecture Notes in Computer Science, vol 15479. Springer, Singapore. https://doi.org/10.1007/978-981-96-0966-6_5
DOI: https://doi.org/10.1007/978-981-96-0966-6_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-96-0965-9
Online ISBN: 978-981-96-0966-6