Abstract
Federated Learning (FL) is a machine learning framework designed to exploit the large amounts of private data held by edge nodes in a distributed system. Data at different edge nodes is often strongly heterogeneous (non-IID), which slows the convergence of federated learning and degrades the performance of the trained model at the edge. In this paper, we propose Federated Mask (FedMask) to address this problem. When initializing the local model with the global model, FedMask uses the Fisher Information Matrix (FIM) as a mask to retain the parameters most important to the local task. Meanwhile, FedMask applies a Maximum Mean Discrepancy (MMD) constraint to avoid instability during training. In addition, we propose a new general evaluation method for FL. Experiments on the MNIST dataset show that our method outperforms the baseline: when the edge data is heterogeneous, our method converges 55% faster than the baseline and improves performance by 2%.
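The abstract's two ingredients, a Fisher-information mask applied at local initialization and an MMD term, can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a diagonal empirical Fisher (squared gradients of the local loss), a hypothetical top-k `keep_ratio` parameter for selecting which local parameters survive initialization, and a linear-kernel MMD; all function names are illustrative.

```python
import torch

def fisher_diagonal(model, data_loader, loss_fn):
    """Estimate the diagonal of the Fisher Information Matrix as the
    average squared gradient of the local loss (empirical Fisher)."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    model.eval()
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / len(data_loader) for n, f in fisher.items()}

def masked_init(local_model, global_weights, fisher, keep_ratio=0.5):
    """Initialize the local model from the global weights, but keep the
    locally most important parameters (highest Fisher values) unchanged."""
    with torch.no_grad():
        for n, p in local_model.named_parameters():
            f = fisher[n].flatten()
            k = max(1, int(keep_ratio * f.numel()))
            threshold = f.topk(k).values.min()
            keep = (fisher[n] >= threshold).float()  # 1 = keep local value
            p.copy_(keep * p + (1 - keep) * global_weights[n])

def linear_mmd(features_a, features_b):
    """Squared MMD with a linear kernel between two feature batches;
    a regularizer of this form can penalize drift between local and
    global representations during training."""
    return (features_a.mean(0) - features_b.mean(0)).pow(2).sum()
```

In this sketch, a client would compute `fisher_diagonal` on its own data, call `masked_init` when receiving the global model, and add `linear_mmd` between local and global features to its training loss.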
This work was supported by the NSFC under Grants 61936011 and 61521002, the National Key R&D Program of China (No. 2018YFB1003703), and the Beijing Key Lab of Networked Multimedia.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Zhu, Z., Sun, L. (2021). Initialize with Mask: For More Efficient Federated Learning. In: Lokoč, J., et al. (eds.) MultiMedia Modeling. MMM 2021. Lecture Notes in Computer Science, vol. 12573. Springer, Cham. https://doi.org/10.1007/978-3-030-67835-7_10
Print ISBN: 978-3-030-67834-0
Online ISBN: 978-3-030-67835-7