Abstract
Establishing each client’s marginal contribution is essential to any decentralised machine-learning process that relies on the participation of remote agents. Detecting harmful participants on an ongoing basis is a significant challenge, because only a very limited amount of information can be obtained from the external environment without breaking the privacy assumption that underlies the federated learning paradigm. In this work, we present the Amplified Contribution Function: a set of aggregation operations performed on the gradients received by the central orchestrator, which makes it possible to investigate non-intrusively the risk of accepting a given set of gradients dispatched from a remote agent. The proposed method is distinguished by a high degree of interpretability and interoperability, as it supports the vast majority of currently available federated techniques and algorithms. Its space and time complexity is similar to that of the leave-one-out method, a common baseline for all deletion and sensitivity analytics tools.
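For orientation, the leave-one-out baseline mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Amplified Contribution Function itself: the equally weighted averaging, the single gradient-descent step, the `score` callback and all function names are hypothetical choices made for the example.

```python
from typing import Callable, Dict, List

Vector = List[float]


def average(gradients: List[Vector]) -> Vector:
    """Element-wise mean of equally weighted gradient vectors (assumed aggregation)."""
    n = len(gradients)
    return [sum(column) / n for column in zip(*gradients)]


def apply_update(weights: Vector, gradient: Vector, lr: float) -> Vector:
    """One plain gradient-descent step on the global model."""
    return [w - lr * g for w, g in zip(weights, gradient)]


def leave_one_out(
    weights: Vector,
    client_gradients: Dict[str, Vector],
    score: Callable[[Vector], float],
    lr: float = 1.0,
) -> Dict[str, float]:
    """Score with all clients minus the score with one client left out.

    Needs at least two clients; a negative value suggests a harmful update.
    """
    everyone = average(list(client_gradients.values()))
    full = score(apply_update(weights, everyone, lr))
    contributions = {}
    for client in client_gradients:
        rest = [g for name, g in client_gradients.items() if name != client]
        without = score(apply_update(weights, average(rest), lr))
        contributions[client] = full - without
    return contributions


# Toy setting (hypothetical): the best model is [1.0, 1.0] and the score is
# the negative squared distance to it, so a higher score is better.
def toy_score(w: Vector) -> float:
    return -sum((wi - 1.0) ** 2 for wi in w)


gradients = {"a": [-1.0, -1.0], "b": [-1.0, -1.0], "c": [3.0, 3.0]}
contributions = leave_one_out([0.0, 0.0], gradients, toy_score)
```

In this toy setting clients `a` and `b` receive a positive contribution and client `c`, whose gradient pushes the model away from the target, receives a negative one. The cost is one extra aggregation and evaluation per client per round.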
Notes
1. We acknowledge that the term instantaneous is an abuse of the concept here, hence the italics. The changes are, in fact, made over the training rounds. However, the metaphor is still handy for illustrative purposes.
2. The threshold was defined in relation to the standard deviation of the sample, but both functions can be tested against a different detection threshold. We chose the standard deviation because it is fairly straightforward to interpret and present.
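A threshold defined relative to the sample standard deviation, as described in note 2, might look like the sketch below. The multiplier `k`, the one-sided rule (flagging only low scores) and the function name are assumptions made for illustration; the notes do not specify them.

```python
from statistics import mean, stdev
from typing import Dict, List


def flag_below_threshold(scores: Dict[str, float], k: float = 1.0) -> List[str]:
    """Return clients scoring more than k sample standard deviations below the mean.

    Needs at least two scores, because the sample standard deviation is
    undefined for a single value.
    """
    values = list(scores.values())
    threshold = mean(values) - k * stdev(values)
    return sorted(name for name, value in scores.items() if value < threshold)


# Hypothetical contribution scores for five clients.
scores = {"a": 0.9, "b": 1.0, "c": 1.1, "d": 1.0, "e": -2.0}
flagged = flag_below_threshold(scores, k=1.0)
```

Changing `k`, or swapping the rule for a quantile or a fixed cut-off, gives the alternative detection thresholds that the note alludes to.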
Acknowledgements
The research leading to these results has received funding from the European Union’s Horizon Europe Programme under the LeADS project (grant agreement no. 956562), the TANGO project (grant agreement no. 101120763), and the CREXData project (grant agreement no. 101092749). The authors wish to express their gratitude to Carlo Metta for his valuable support.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zuziak, M.K., Rinzivillo, S. (2024). Amplified Contribution Analysis for Federated Learning. In: Miliou, I., Piatkowski, N., Papapetrou, P. (eds) Advances in Intelligent Data Analysis XXII. IDA 2024. Lecture Notes in Computer Science, vol 14642. Springer, Cham. https://doi.org/10.1007/978-3-031-58553-1_6
DOI: https://doi.org/10.1007/978-3-031-58553-1_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-58555-5
Online ISBN: 978-3-031-58553-1
eBook Packages: Computer Science, Computer Science (R0)