Abstract
Mobile Internet applications on smartphones dominate large portions of daily life for many people. Conventional machine learning-based knowledge acquisition collects users' data on a centralized server and then trains an intelligent model, such as a recommendation or prediction model, on all the collected data. This approach raises serious privacy concerns and violates the rules of the newly enacted General Data Protection Regulation (GDPR). This paper proposes a new attention-augmented federated learning framework that conducts decentralized knowledge acquisition for mobile Internet application scenarios, such as mobile keyboard suggestion. In particular, an attention mechanism aggregates the knowledge that each mobile device has acquired locally from its own data, so the centralized server aggregates knowledge without direct access to personal data. Experiments on three real-world datasets demonstrate that the proposed framework outperforms baseline methods in terms of both perplexity and communication cost.
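The server-side aggregation described above can be sketched as follows. This is a minimal illustrative example, not the paper's exact algorithm: it assumes attention scores are a softmax over the parameter distance between the server model and each client model, and that the server takes a step of size `epsilon` toward the attention-weighted client models. The function name and step-size parameter are hypothetical.

```python
import numpy as np

def attentive_aggregate(server_w, client_ws, epsilon=1.0):
    """Attention-weighted federated aggregation (illustrative sketch).

    Clients train on their own data locally and upload only model
    parameters; the server never sees raw user data.
    """
    # Distance of each client model from the current server model.
    dists = np.array([np.linalg.norm(server_w - w) for w in client_ws])
    # Softmax attention scores over the distances (sign convention is
    # an assumption of this sketch, not taken from the paper).
    scores = np.exp(dists) / np.sum(np.exp(dists))
    # Move the server model toward the attention-weighted client models.
    delta = sum(a * (server_w - w) for a, w in zip(scores, client_ws))
    return server_w - epsilon * delta

# Example: two identical client updates pull the server fully toward them.
server = attentive_aggregate(np.zeros(3), [np.ones(3), np.ones(3)])
```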
Notes
Penn Treebank is available at https://github.com/wojzaremba/lstm/tree/master/data
WikiText-2 is available at https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip
Available at https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/, retrieved in December 2018
Reddit Comments dataset is available at https://www.kaggle.com/reddit/reddit-comments-may-2015
This article belongs to the Topical Collection: Special Issue on Application-Driven Knowledge Acquisition
Guest Editors: Xue Li, Sen Wang, and Bohan Li
Cite this article
Jiang, J., Ji, S. & Long, G. Decentralized Knowledge Acquisition for Mobile Internet Applications. World Wide Web 23, 2653–2669 (2020). https://doi.org/10.1007/s11280-019-00775-w