Abstract
Breaking down data silos and promoting data circulation and cooperation is an important topic in the digital age. As data security and privacy protection receive widespread attention, the traditional cooperation model based on data centralization has been challenged. Federated learning offers a technical solution to this problem, but its multi-party cooperation and data invisibility expose it to the risk of data fraud: malicious participants can manipulate data, individually or in collusion, to illegally obtain data or influence the federated learning model. This paper proposes a novel data fraud detection method based on distributed collaborative representation, which detects data fraud in federated learning through collaborative clustering, adaptive representation, and dynamic weighting. The proposed method overcomes a weakness of existing methods, which detect data fraud mechanically and statically and cannot be organically combined with the training objectives and process. It realizes dynamic, continuous, anti-collusion soft-constraint detection while keeping fraud detection and contribution evaluation relatively independent. Our research is of great significance for federated learning to address the risk of data fraud and to better apply to real-world scenarios.
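The abstract's idea of combining clustering over client behavior with dynamic (soft-constraint) weighting can be illustrated with a minimal sketch. This is a hypothetical illustration, not the paper's actual algorithm: it scores each client's model update by its agreement (cosine similarity) with the other clients' updates, then aggregates with weights that smoothly suppress divergent, possibly fraudulent, updates instead of hard-removing them. All function names and the similarity-based scoring rule are assumptions for illustration.

```python
# Hypothetical sketch (NOT the paper's method): down-weight a
# fraudulent client by scoring each update against the others,
# then aggregate with soft, dynamically computed weights.
import numpy as np

def cosine(a, b):
    # Cosine similarity with a small epsilon to avoid division by zero.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def robust_aggregate(updates):
    """updates: list of 1-D numpy arrays, one model update per client.

    Returns the weighted aggregate update and the per-client weights.
    """
    n = len(updates)
    # Score each client by its mean cosine similarity to all other
    # clients; colluding or poisoned updates tend to diverge from
    # the majority direction and receive low (even negative) scores.
    scores = np.array([
        np.mean([cosine(updates[i], updates[j]) for j in range(n) if j != i])
        for i in range(n)
    ])
    # Soft constraint: clip negative scores to zero and normalize,
    # rather than hard-excluding suspicious clients.
    weights = np.clip(scores, 0.0, None)
    weights = weights / weights.sum()
    agg = sum(w * u for w, u in zip(weights, updates))
    return agg, weights

# Toy round: nine honest clients near a shared direction, one
# poisoned client submitting a flipped, scaled update.
rng = np.random.default_rng(0)
honest = [np.ones(5) + 0.05 * rng.standard_normal(5) for _ in range(9)]
poisoned = [-10.0 * np.ones(5)]
agg, w = robust_aggregate(honest + poisoned)
```

Because the poisoned update points opposite the honest majority, its mean similarity is negative and it receives zero weight, so the aggregate stays close to the honest clients' mean update.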
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (grant numbers 72271059, 71971067). Chenghong Zhang is the corresponding author.
Copyright information
© 2023 Springer Nature Switzerland AG
Cite this paper
Zhang, Z., Zhang, C., Chen, G., Xiao, S., Huang, L. (2023). Distinguishing Good from Bad: Distributed-Collaborative-Representation-Based Data Fraud Detection in Federated Learning. In: Nah, F., Siau, K. (eds) HCI in Business, Government and Organizations. HCII 2023. Lecture Notes in Computer Science, vol 14039. Springer, Cham. https://doi.org/10.1007/978-3-031-36049-7_19
DOI: https://doi.org/10.1007/978-3-031-36049-7_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36048-0
Online ISBN: 978-3-031-36049-7
eBook Packages: Computer Science, Computer Science (R0)