Abstract
Breaking down data silos and promoting data circulation and cooperation is an important topic in the digital age. As data security and privacy protection receive widespread attention, the traditional cooperation model based on data centralization has been challenged. Federated learning offers a technical solution to this problem, but its multi-party cooperation and data invisibility expose it to the risk of data fraud: malicious participants can manipulate data, individually or in collusion, to illegally obtain data or influence the federated learning model. This paper proposes a novel data fraud detection method based on distributed collaborative representation, which detects data fraud in federated learning through collaborative clustering, adaptive representation, and dynamic weighting. The proposed method overcomes a weakness of existing methods, which detect data fraud mechanically and statically and cannot be organically combined with the training objectives and process. It realizes dynamic, continuous, anti-collusion soft-constraint detection while keeping fraud detection and contribution evaluation relatively independent. Our research is of great significance for federated learning to address the risk of data fraud and to better apply to real-world scenarios.
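The abstract's idea of combining clustering over client behavior with dynamic (soft-constraint) weighting can be illustrated with a minimal sketch. This is a hypothetical illustration, not the paper's actual algorithm: it scores each client's model update by its agreement (cosine similarity) with the other clients' updates, then aggregates with weights that smoothly suppress divergent, possibly fraudulent, updates instead of hard-removing them. All function names and the similarity-based scoring rule are assumptions for illustration.

```python
# Hypothetical sketch (NOT the paper's method): down-weight a
# fraudulent client by scoring each update against the others,
# then aggregate with soft, dynamically computed weights.
import numpy as np

def cosine(a, b):
    # Cosine similarity with a small epsilon to avoid division by zero.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def robust_aggregate(updates):
    """updates: list of 1-D numpy arrays, one model update per client.

    Returns the weighted aggregate update and the per-client weights.
    """
    n = len(updates)
    # Score each client by its mean cosine similarity to all other
    # clients; colluding or poisoned updates tend to diverge from
    # the majority direction and receive low (even negative) scores.
    scores = np.array([
        np.mean([cosine(updates[i], updates[j]) for j in range(n) if j != i])
        for i in range(n)
    ])
    # Soft constraint: clip negative scores to zero and normalize,
    # rather than hard-excluding suspicious clients.
    weights = np.clip(scores, 0.0, None)
    weights = weights / weights.sum()
    agg = sum(w * u for w, u in zip(weights, updates))
    return agg, weights

# Toy round: nine honest clients near a shared direction, one
# poisoned client submitting a flipped, scaled update.
rng = np.random.default_rng(0)
honest = [np.ones(5) + 0.05 * rng.standard_normal(5) for _ in range(9)]
poisoned = [-10.0 * np.ones(5)]
agg, w = robust_aggregate(honest + poisoned)
```

Because the poisoned update points opposite the honest majority, its mean similarity is negative and it receives zero weight, so the aggregate stays close to the honest clients' mean update.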
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (grant numbers 72271059, 71971067). Chenghong Zhang is the corresponding author.
Copyright information
© 2023 Springer Nature Switzerland AG
Cite this paper
Zhang, Z., Zhang, C., Chen, G., Xiao, S., Huang, L. (2023). Distinguishing Good from Bad: Distributed-Collaborative-Representation-Based Data Fraud Detection in Federated Learning. In: Nah, F., Siau, K. (eds) HCI in Business, Government and Organizations. HCII 2023. Lecture Notes in Computer Science, vol 14039. Springer, Cham. https://doi.org/10.1007/978-3-031-36049-7_19
DOI: https://doi.org/10.1007/978-3-031-36049-7_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36048-0
Online ISBN: 978-3-031-36049-7
eBook Packages: Computer Science, Computer Science (R0)