Abstract
Federated Learning (FL) is a promising framework for distributed learning when data is private and sensitive. However, state-of-the-art solutions in this framework are suboptimal when data is heterogeneous and non-IID. We propose a practical and robust approach to personalization in FL that adjusts to heterogeneous and non-IID data by balancing exploration and exploitation of several global models. To achieve personalization, we use a Mixture of Experts (MoE) that learns to group clients that are similar to each other, while using the global models more efficiently. We show that our approach achieves accuracy up to 29.78% higher than the state of the art and up to 4.38% higher than a local model in a pathological non-IID setting, even though we tune our approach in the IID setting.
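To make the core idea concrete, the following is a minimal sketch (not the paper's exact method, and the names and training details are illustrative assumptions) of a client-side Mixture of Experts: a client learns gate weights over several frozen "global" models so that the mixture best fits its own non-IID local data.

```python
import numpy as np

def softmax(g):
    # Numerically stable softmax over gate logits.
    e = np.exp(g - g.max())
    return e / e.sum()

def train_gate(experts, x, y, steps=500, lr=0.5):
    """Learn gate logits g so that sum_i softmax(g)_i * expert_i(x) fits (x, y).

    `experts` is a list of frozen callables (stand-ins for global models);
    only the gate is trained on the client's local data.
    """
    g = np.zeros(len(experts))
    F = np.stack([f(x) for f in experts])        # (n_experts, n_samples)
    for _ in range(steps):
        p = softmax(g)
        pred = p @ F                             # mixture prediction
        err = pred - y
        dL_dp = 2.0 * (F @ err) / len(y)         # grad of MSE w.r.t. mixture weights
        # Chain through softmax: dp_i/dg_j = p_i * (delta_ij - p_j)
        dL_dg = p * (dL_dp - p @ dL_dp)
        g -= lr * dL_dg
    return softmax(g)

# Two frozen "global" experts; this client's local data matches expert 0 (y = 2x).
experts = [lambda x: 2.0 * x, lambda x: -1.0 * x]
x = np.linspace(-1.0, 1.0, 50)
y = 2.0 * x
weights = train_gate(experts, x, y)
print(weights)  # the gate should strongly favour expert 0
```

In the full approach the gate is itself a learned model of the input (so different samples can route to different experts) and one expert is a purely local model; this sketch keeps only the exploitation side, showing how a client concentrates gate mass on whichever global model matches its distribution.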
Notes
1. The source code for the experiments can be found at https://github.com/EricssonResearch/fl-moe.
Acknowledgment
This work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.
The computations were enabled by the supercomputing resource Berzelius, provided by the National Supercomputer Centre at Linköping University and the Knut and Alice Wallenberg Foundation.
We thank all reviewers who made suggestions to help improve and clarify this manuscript, especially Dr. A. Alam, F. Cornell, Dr. R. Gaigalas, T. Kvernvik, C. Svahn, F. Vannella, Dr. H. Shokri Ghadikolaei, D. Sandberg and Prof. S. Haridi.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Isaksson, M., Listo Zec, E., Cöster, R., Gillblad, D., Girdzijauskas, S. (2023). Adaptive Expert Models for Federated Learning. In: Goebel, R., Yu, H., Faltings, B., Fan, L., Xiong, Z. (eds) Trustworthy Federated Learning. FL 2022. Lecture Notes in Computer Science, vol 13448. Springer, Cham. https://doi.org/10.1007/978-3-031-28996-5_1
Print ISBN: 978-3-031-28995-8
Online ISBN: 978-3-031-28996-5