Abstract
Federated Learning (FL) is a machine learning framework designed to exploit the large amounts of private data held by edge nodes in a distributed system. Data at different edge nodes is often strongly heterogeneous (non-IID), which slows the convergence of federated learning and degrades the performance of the trained model at the edge. In this paper, we propose Federated Mask (FedMask) to address this problem. When initializing the local model with the global model, FedMask uses the Fisher Information Matrix (FIM) as a mask to retain the parameters most important to the local task. Meanwhile, FedMask applies a Maximum Mean Discrepancy (MMD) constraint to avoid instability during training. In addition, we propose a new general evaluation method for FL. Experiments on the MNIST dataset show that our method outperforms the baseline: when the edge data is heterogeneous, our method converges 55% faster than the baseline and improves performance by 2%.
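The abstract's two ingredients, a Fisher-information mask applied at local initialization and an MMD term, can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a diagonal empirical Fisher (squared gradients of the local loss), a hypothetical top-k `keep_ratio` parameter for selecting which local parameters survive initialization, and a linear-kernel MMD; all function names are illustrative.

```python
import torch

def fisher_diagonal(model, data_loader, loss_fn):
    """Estimate the diagonal of the Fisher Information Matrix as the
    average squared gradient of the local loss (empirical Fisher)."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    model.eval()
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / len(data_loader) for n, f in fisher.items()}

def masked_init(local_model, global_weights, fisher, keep_ratio=0.5):
    """Initialize the local model from the global weights, but keep the
    locally most important parameters (highest Fisher values) unchanged."""
    with torch.no_grad():
        for n, p in local_model.named_parameters():
            f = fisher[n].flatten()
            k = max(1, int(keep_ratio * f.numel()))
            threshold = f.topk(k).values.min()
            keep = (fisher[n] >= threshold).float()  # 1 = keep local value
            p.copy_(keep * p + (1 - keep) * global_weights[n])

def linear_mmd(features_a, features_b):
    """Squared MMD with a linear kernel between two feature batches;
    a regularizer of this form can penalize drift between local and
    global representations during training."""
    return (features_a.mean(0) - features_b.mean(0)).pow(2).sum()
```

In this sketch, a client would compute `fisher_diagonal` on its own data, call `masked_init` when receiving the global model, and add `linear_mmd` between local and global features to its training loss.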
This work was supported by the NSFC under Grants 61936011 and 61521002, the National Key R&D Program of China (No. 2018YFB1003703), and the Beijing Key Lab of Networked Multimedia.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Zhu, Z., Sun, L. (2021). Initialize with Mask: For More Efficient Federated Learning. In: Lokoč, J., et al. (eds.) MultiMedia Modeling. MMM 2021. Lecture Notes in Computer Science, vol. 12573. Springer, Cham. https://doi.org/10.1007/978-3-030-67835-7_10
Print ISBN: 978-3-030-67834-0
Online ISBN: 978-3-030-67835-7