Abstract
In this paper we present new attacks against federated learning when used to train natural language text models. We illustrate the effectiveness of the attacks against the next word prediction model used in Google’s GBoard app, a widely used mobile keyboard app that has been an early adopter of federated learning for production use. We demonstrate that the words a user types on their mobile handset, e.g. when sending text messages, can be recovered with high accuracy under a wide range of conditions and that counter-measures such as the use of mini-batches and adding local noise are ineffective. We also show that the word order (and so the actual sentences typed) can be reconstructed with high fidelity. This raises obvious privacy concerns, particularly since GBoard is in production use.
M. Suliman—Now at IBM Research Europe - Dublin.
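To make the threat model concrete, below is a minimal sketch, not the specific attack presented in this paper, of one generic way a single client update can reveal which words were typed: in an embedding layer, only the rows for tokens that appear in the local training text receive non-zero gradients, so diffing the locally updated weights against the global model exposes those tokens. The toy architecture, vocabulary size, learning rate and token ids are illustrative assumptions only.

```python
# A hedged illustration (assumed toy model, not GBoard's architecture) of how the
# difference between a client's updated weights and the original global model can
# reveal which words were typed: only embedding rows for tokens present in the
# local text receive non-zero gradients, so only those rows move during training.
import torch
import torch.nn as nn

VOCAB, EMBED_DIM, CONTEXT = 10_000, 96, 3     # illustrative sizes, not GBoard's
torch.manual_seed(0)

model = nn.Sequential(
    nn.Embedding(VOCAB, EMBED_DIM),
    nn.Flatten(),
    nn.Linear(CONTEXT * EMBED_DIM, VOCAB),
)

typed = torch.tensor([[17, 256, 4023]])       # hypothetical token ids the user typed
target = torch.tensor([99])                   # hypothetical next word

global_weights = model[0].weight.detach().clone()
opt = torch.optim.SGD(model.parameters(), lr=0.001)
opt.zero_grad()
nn.functional.cross_entropy(model(typed), target).backward()
opt.step()

# The federated update is (local weights - global weights); the rows that moved
# identify exactly which vocabulary items occurred in the client's text.
delta = model[0].weight.detach() - global_weights
leaked_tokens = torch.nonzero(delta.abs().sum(dim=1) > 0).flatten().tolist()
print(leaked_tokens)                          # -> [17, 256, 4023]
```

This is only the simplest form of leakage; the attacks in the paper recover the words typed, and their order, even when mini-batches or local noise are used.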
Notes
1. DP aims to protect the aggregate training data/model against query-based attacks, whereas our attack targets the individual updates. Nevertheless, we note that DP is sometimes suggested as a potential defence against the type of attack carried out here.
2. Google’s Secure Aggregation approach [5] is a prominent example of an approach requiring trust in the server, or more specifically in the PKI infrastructure, which in practice is operated by the same organisation that runs the FL server since it involves authentication/verification of clients. We note also that Secure Aggregation is not currently deployed in the GBoard app despite being proposed 6 years ago.
3. It is perhaps worth noting that we studied a variety of reconstruction attacks, e.g., using Monte Carlo Tree Search to perform a smart search over all word sequences, but found the attack method described here to be simple, efficient and highly effective.
4. Note that in DPSGD the added noise is multiplied by the learning rate \(\eta \), and so this factor needs to be taken into account when comparing the \(\sigma \) values used for DPSGD with those used for single noise addition. Added noise with standard deviation \(\sigma \) in DPSGD corresponds roughly to a standard deviation of \(\eta \sqrt{EB}\sigma \) with single noise addition. For \(\eta =0.001\), \(E=1000\), \(B=32\), \(\sigma =0.1\), the corresponding single noise addition standard deviation is 0.018.
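As a quick check of the arithmetic in note 4, the sketch below reproduces the stated conversion using only the values given there.

```python
# Reproduces the arithmetic in note 4: per-step DPSGD noise of std sigma is scaled
# by the learning rate eta and accumulates over E*B independent steps, so it is
# roughly equivalent to a single noise addition with std eta * sqrt(E*B) * sigma.
import math

eta, E, B, sigma = 0.001, 1000, 32, 0.1
equivalent_std = eta * math.sqrt(E * B) * sigma
print(round(equivalent_std, 3))   # -> 0.018
```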
References
Gboard – the Google Keyboard (2022). https://play.google.com/store/apps/details?id=com.google.android.inputmethod.latin. Accessed 24 Oct 2022
Abadi, M., et al.: Deep learning with differential privacy. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (2016)
Ball, J.: NSA collects millions of text messages daily in ‘untargeted’ global sweep (2014)
Boenisch, F., Dziedzic, A., Schuster, R., Shamsabadi, A.S., Shumailov, I., Papernot, N.: When the curious abandon honesty: Federated learning is not private. arXiv preprint arXiv:2112.02918 (2021)
Bonawitz, K., et al.: Practical secure aggregation for federated learning on user-held data. arXiv preprint arXiv:1611.04482 (2016)
Carlini, N., Liu, C., Erlingsson, Ú., Kos, J., Song, D.: The secret sharer: evaluating and testing unintended memorization in neural networks. In: Proceedings of the 28th USENIX Conference on Security Symposium, SEC 2019, USA, pp. 267–284. USENIX Association (2019)
Carlini, N., et al.: Extracting training data from large language models. In: 30th USENIX Security Symposium (USENIX Security 2021), pp. 2633–2650 (2021)
Deng, J., et al.: TAG: gradient attack on transformer-based language models. arXiv preprint arXiv:2103.06819 (2021)
Geiping, J., Bauermeister, H., Dröge, H., Moeller, M.: Inverting gradients - how easy is it to break privacy in federated learning? In: Advances in Neural Information Processing Systems (2020)
Greff, K., Srivastava, R.K., Koutník, J., Steunebrink, B.R., Schmidhuber, J.: LSTM: a search space odyssey. IEEE Trans. Neural Netw. Learn. Syst. 28(10), 2222–2232 (2017)
Hard, A., et al.: Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604 (2018)
Jin, X., Chen, P.-Y., Hsu, C.-Y., Yu, C.M., Chen, T.: Catastrophic data leakage in vertical federated learning. In: Advances in Neural Information Processing Systems, vol. 34 (2021)
Kairouz, P., McMahan, B., Song, S., Thakkar, O., Thakurta, A., Xu, Z.: Practical and private (deep) learning without sampling or shuffling. arXiv preprint arXiv:2103.00039 (2021)
Leith, D.J.: Mobile handset privacy: measuring the data iOS and Android send to Apple and Google. In: Proceedings of SecureComm (2021)
Leith, D.J., Farrell, S.: Contact tracing app privacy: what data is shared by Europe’s GAEN contact tracing apps. In: Proceedings of IEEE INFOCOM (2021)
Marzal, A., Vidal, E.: Computation of normalized edit distance and applications. IEEE Trans. Pattern Anal. Mach. Intell. 15(9), 926–932 (1993)
McMahan, B., Moore, E., Ramage, D., Hampson, S., y Arcas, B.A.: Communication-efficient learning of deep networks from decentralized data. In: Artificial Intelligence and Statistics (2017)
McMahan, H.B., Ramage, D., Talwar, K., Zhang, L.: Learning differentially private recurrent language models. In: International Conference on Learning Representations (2018)
O’Day, D.R., Calix, R.A.: Text message corpus: applying natural language processing to mobile device forensics. In: 2013 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) (2013)
Pan, X., Zhang, M., Yan, Y., Zhu, J., Yang, M.: Theory-oriented deep leakage from gradients via linear equation solver. arXiv preprint arXiv:2010.13356 (2020)
Pasquini, D., Francati, D., Ateniese, G.: Eluding secure aggregation in federated learning via model inconsistency. arXiv preprint arXiv:2111.07380 (2021)
Press, O., Wolf, L.: Using the output embedding to improve language models. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, Valencia, Spain, pp. 157–163. Association for Computational Linguistics (2017)
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems (2017)
Wang, Y., et al.: SAPAG: a self-adaptive privacy attack from gradients. arXiv preprint arXiv:2009.06228 (2020)
Yin, H., Mallya, A., Vahdat, A., Alvarez, J.M., Kautz, J., Molchanov, P.: See through gradients: image batch recovery via GradInversion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16337–16346 (2021)
Zhao, B., Mopuri, K.R., Bilen, H.: iDLG: improved deep leakage from gradients. arXiv preprint arXiv:2001.02610 (2020)
Zhu, J., Blaschko, M.: R-GAP: recursive gradient attack on privacy. arXiv preprint arXiv:2010.07733 (2020)
Zhu, L., Liu, Z., Han, S.: Deep leakage from gradients. In: Annual Conference on Neural Information Processing Systems (NeurIPS) (2019)
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Suliman, M., Leith, D. (2024). Two Models are Better Than One: Federated Learning is Not Private for Google GBoard Next Word Prediction. In: Tsudik, G., Conti, M., Liang, K., Smaragdakis, G. (eds) Computer Security – ESORICS 2023. ESORICS 2023. Lecture Notes in Computer Science, vol 14347. Springer, Cham. https://doi.org/10.1007/978-3-031-51482-1_6