
Distinguishing Good from Bad: Distributed-Collaborative-Representation-Based Data Fraud Detection in Federated Learning

  • Conference paper

HCI in Business, Government and Organizations (HCII 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14039)


Abstract

Breaking down data silos and promoting data circulation and cooperation is an important topic in the digital age. As data security and privacy protection receive widespread attention, the traditional cooperation model based on data centralization has been challenged. Federated learning provides a technical solution to this problem, but its multi-party cooperation and data invisibility expose it to the risk of data fraud: malicious participants can manipulate data, individually or in collusion, to illegally obtain data or to influence the federated learning model. This paper proposes a novel data fraud detection method based on distributed collaborative representation, which achieves effective detection of data fraud in federated learning through collaborative clustering, adaptive representation, and dynamic weighting. The proposed method overcomes a weakness of existing approaches, which detect data fraud mechanically and statically and therefore cannot be organically integrated with the training objectives and process. It realizes dynamic, continuous, anti-collusion soft-constraint detection while keeping fraud detection and contribution evaluation relatively independent. Our research is of great significance for helping federated learning cope with the risk of data fraud and be better applied in real-world scenarios.
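The abstract does not give algorithmic details, but the three ingredients it names (collaborative clustering, adaptive representation, dynamic weighting) can be illustrated with a generic sketch. The Python fragment below is not the authors' method; it is a minimal, assumption-laden illustration of how clustering client updates and dynamically weighting them during aggregation can down-weight colluding participants. All names (score_and_aggregate, temperature) and the honest-majority assumption are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

def score_and_aggregate(client_updates, n_clusters=2, temperature=1.0):
    """Illustrative sketch only: cluster flattened client updates, score each
    client by its distance to the majority cluster centre (assumed honest),
    and aggregate with weights that decay for suspicious, far-away clients."""
    X = np.stack([u.ravel() for u in client_updates])           # (n_clients, d)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    majority = np.bincount(labels).argmax()                     # honest-majority assumption
    centre = X[labels == majority].mean(axis=0)
    dists = np.linalg.norm(X - centre, axis=1)                  # anomaly score per client
    weights = np.exp(-dists / (temperature * (dists.std() + 1e-12)))
    weights /= weights.sum()                                    # dynamic weighting
    aggregated = (weights[:, None] * X).sum(axis=0)
    return aggregated.reshape(client_updates[0].shape), weights

# Toy usage: 8 honest clients near zero, 2 colluding clients pushing a large offset.
rng = np.random.default_rng(0)
honest = [rng.normal(0.0, 0.1, size=(4, 3)) for _ in range(8)]
colluders = [rng.normal(5.0, 0.1, size=(4, 3)) for _ in range(2)]
agg, w = score_and_aggregate(honest + colluders)
print("weights:", np.round(w, 3))  # colluding clients should receive near-zero weight
```

In this toy setup the colluding updates form their own cluster far from the honest centre, so their weights shrink while aggregation continues; the soft weighting (rather than a hard cut-off) loosely mirrors the "soft constraint" idea mentioned in the abstract, under the stated assumptions.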



Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (grant numbers 72271059, 71971067). Chenghong Zhang is the corresponding author.

Author information


Corresponding author

Correspondence to Chenghong Zhang.



Copyright information

© 2023 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, Z., Zhang, C., Chen, G., Xiao, S., Huang, L. (2023). Distinguishing Good from Bad: Distributed-Collaborative-Representation-Based Data Fraud Detection in Federated Learning. In: Nah, F., Siau, K. (eds) HCI in Business, Government and Organizations. HCII 2023. Lecture Notes in Computer Science, vol 14039. Springer, Cham. https://doi.org/10.1007/978-3-031-36049-7_19


  • DOI: https://doi.org/10.1007/978-3-031-36049-7_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36048-0

  • Online ISBN: 978-3-031-36049-7

  • eBook Packages: Computer Science, Computer Science (R0)
