
FedAda: Fast-convergent adaptive federated learning in heterogeneous mobile edge computing environment


Abstract

With the rapid advancement of Internet of Things (IoT) and social networking applications generating large amounts of data at or close to the network edge, Mobile Edge Computing (MEC) has naturally been proposed to bring model training closer to where data is produced. However, privacy concerns remain, since typical MEC frameworks need to transfer sensitive data from the end devices/clients that collect it to the MEC server. Federated Learning (FL) has therefore been introduced to support privacy-preserving collaborative machine learning among multiple clients coordinated by the MEC server, without centralizing the private data. Unfortunately, FL faces two major challenges: 1) systems heterogeneity between clients causes the straggler issue, and 2) statistical heterogeneity between clients brings about the objective inconsistency problem; both can significantly slow down convergence in heterogeneous MEC environments. In this paper, we propose a novel framework, FedAda (Federated Adaptive Training), that incorporates the systems capabilities and data characteristics of the clients to adaptively assign an appropriate workload to each client. The key idea is that, instead of running a fixed number of local training iterations as in Federated Averaging (FedAvg), our algorithm adopts an adaptive workload assignment strategy that minimizes the runtime gap between clients while maximizing the convergence gain in heterogeneous MEC environments. Moreover, we design a lightweight mechanism extending FedAda that accelerates convergence by further fine-tuning the workload assignment based on the global convergence status in each communication round. We evaluate FedAda on the CIFAR-10 dataset in a simulated heterogeneous MEC environment. Experimental results show that FedAda assigns an appropriate amount of workload to each client and reduces the convergence time by up to 49.5% compared to FedAvg. In addition, we demonstrate that fine-tuning the workload assignment further improves FedAda's learning performance in heterogeneous MEC environments.
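To make the workload-assignment idea concrete, here is a minimal sketch in Python. The function name and the proportional rule are illustrative assumptions, not FedAda's exact algorithm, which also incorporates the loss-based decay factor derived in Appendix A:

```python
def assign_local_iterations(sec_per_iter, tau_max, beta=1.0):
    """Illustrative adaptive workload assignment (hypothetical helper).

    sec_per_iter : measured seconds per local iteration for each client
    tau_max      : iteration budget for the slowest client
    beta         : global decay factor in (0, 1] (see Appendix A)

    Each client receives as many local iterations as fit into the round
    duration set by the slowest client, shrinking the runtime gap that
    causes stragglers under a fixed per-client budget as in FedAvg.
    """
    round_time = tau_max * max(sec_per_iter)  # slowest client bounds the round
    return [max(1, int(beta * round_time / t)) for t in sec_per_iter]

# Three clients with heterogeneous compute speeds (seconds per iteration):
print(assign_local_iterations([0.10, 0.25, 0.50], tau_max=20))
# -> [100, 40, 20]: faster clients run more local iterations per round
```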



Acknowledgements

This work is supported by the National Key R&D Program of China under Grant 2018AAA0100500, the National Natural Science Foundation of China under Grants No. 61972085, 61872079, 61632008, and 62072099, the Jiangsu Provincial Key Laboratory of Network and Information Security under Grant No. BM2003201, the Key Laboratory of Computer Network and Information Integration of the Ministry of Education of China under Grant No. 93K-9, the Southeast University-China Mobile Research Institute Joint Innovation Center (No. R21701010102018), and the University Synergy Innovation Program of Anhui Province under Grant No. GXXT-2020-012, and is partially supported by the Collaborative Innovation Center of Novel Software Technology and Industrialization, the Fundamental Research Funds for the Central Universities, and the CCF-Baidu Open Fund (No. 2021PP15002000). We also thank the Big Data Computing Center of Southeast University for providing the experiment environment and computing facilities.

Author information


Corresponding author

Correspondence to Jiahui Jin.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

This article belongs to the Topical Collection: Special Issue on Resource Management at the Edge for Future Web, Mobile and IoT Applications

Qiang He, Fang Dong, Chenshu Wu, and Yun Yang

Appendix A: Derivation of the analytical solution of the optimization objective function

Since \(f(w^*)=0\) and \(\tau_i^{\prime} = \beta \tau_i\), Equation (9) becomes:

$$\begin{aligned} \frac{f(w^t)}{\beta \sum_{i=1}^{N}\tau_i} + \lambda \beta^2 \tau_{max}^2 \end{aligned}$$
(17)

Multiplying Equation (17) by the constant \(\sum_{i=1}^{N}\tau_i\) produces a new objective function:

$$\begin{aligned} h(\beta) = \frac{f(w^t)}{\beta} + \lambda \beta^2 \tau_{max}^2 \sum_{i=1}^{N}\tau_i \end{aligned}$$
(18)

Since every \(\tau_i\) is already known at the beginning of each communication round, let \(K=\tau_{max}^2\sum_{i=1}^{N}\tau_i\); the optimization objective then becomes:

$$\begin{aligned} h(\beta) = \frac{f(w^t)}{\beta} + \lambda \beta^2 K \end{aligned}$$
(19)

Taking the derivative of \(h\) with respect to \(\beta\):

$$\begin{aligned} \frac{dh}{d\beta} = -\frac{f(w^t)}{\beta^2} + 2\lambda \beta K \end{aligned}$$
(20)

Setting \(\frac{dh}{d\beta} = 0\) gives the analytical solution \(\beta = \root 3 \of {\frac{f(w^t)}{2\lambda K}}\) of the optimization objective function \(h(\beta)\) with respect to \(\beta\). Note that \(\beta\) decreases as the average loss \(f(w^t)\) decreases.
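For completeness, this stationary point is indeed the global minimum: since \(f(w^t)>0\), \(\lambda>0\), and \(K>0\),

$$\begin{aligned} \frac{d^2h}{d\beta^2} = \frac{2f(w^t)}{\beta^3} + 2\lambda K > 0 \quad \text{for all } \beta > 0, \end{aligned}$$

so \(h(\beta)\) is strictly convex on \(\beta > 0\) and the stationary point is its unique minimizer.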

At the beginning of the training process, the decay value should be \(\beta=1\), and the loss on the initial model is \(f(w^0)\). Requiring \(\beta = \root 3 \of {\frac{f(w^0)}{2\lambda K}} = 1\) yields the constant \(\lambda =\frac{f(w^0)}{2K}\). Substituting this constant into the analytical solution gives the optimal solution in communication round \(t\), \(\beta = \root 3 \of {\frac{f(w^t)}{f(w^0)}}\), as stated in Equation (16).
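A minimal Python sketch of this per-round decay rule (the function name and the numeric values are illustrative assumptions):

```python
def decay_factor(current_loss, initial_loss):
    """beta = cbrt(f(w^t) / f(w^0)): equals 1 at round 0 and shrinks
    as the global loss decreases, reducing per-client iteration budgets."""
    return (current_loss / initial_loss) ** (1.0 / 3.0)

# Example: the global loss has halved relative to the initial model.
beta = decay_factor(current_loss=1.2, initial_loss=2.4)
print(round(beta, 3))  # 0.794 -> assign ~79% of the baseline iterations
```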


Cite this article

Zhang, J., Cheng, X., Wang, C. et al. FedAda: Fast-convergent adaptive federated learning in heterogeneous mobile edge computing environment. World Wide Web 25, 1971–1998 (2022). https://doi.org/10.1007/s11280-021-00989-x

