A graph-powered large-scale fraud detection system

Li, Zhao; Wang, Biao; Huang, Jiaming; Jin, Yilun; Xu, Zenghui; Zhang, Ji; Gao, Jianliang

doi:10.1007/s13042-023-01786-w

Original Article
Published: 14 February 2023

Volume 15, pages 115–128, (2024)

International Journal of Machine Learning and Cybernetics

Zhao Li^1,2,
Biao Wang^2,
Jiaming Huang^3,
Yilun Jin^2,6,
Zenghui Xu^2,
Ji Zhang^4 &
Jianliang Gao^5

Abstract

Fraud detection is a common problem in many areas, such as e-commerce, banking, insurance and social networks, where data can be naturally formulated as a graph structure. In e-commerce especially, the large scale and the enormous volume of real-time transactions over millions of items have made fraud detection an important and serious problem. The challenges lie in three aspects: sparse fraud samples, complex features in online transactions, and the extra-large scale of e-commerce data. To address these issues, we propose an efficient graph-powered large-scale fraud detection framework. Concretely, we first present a heterogeneous label propagation algorithm to recall more potentially fraudulent samples for further model training; we then design a novel multi-view heterogeneous graph neural network model to obtain more accurate fraud predictions; finally, we present a fraud pattern analysis approach to discover hidden fraud groups. In addition, to improve the efficiency and scalability of the proposed framework, we present a large-scale fraud detection system deployed on a general graph computing engine. We conduct experiments on two real-world datasets. The results show that the proposed graph-powered fraud detection framework achieves high accuracy and superior scalability on large-scale graph data.
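
To illustrate the first step, the following is a minimal sketch of generic label propagation on a small user-item graph. It is not the heterogeneous algorithm proposed in the paper, whose details are not given in the abstract: the sketch ignores node and edge types, and the function names `propagate_labels` and `recall_candidates`, the `damping` factor, the threshold, and the toy graph are all assumptions made for illustration.

```python
from collections import defaultdict


def propagate_labels(edges, seeds, num_iters=10, damping=0.8):
    """Spread fraud scores from labeled seed nodes to their graph neighbors.

    edges: iterable of (node_a, node_b) pairs, treated as undirected.
    seeds: dict mapping node -> known score (1.0 = fraud, 0.0 = benign).
    Returns a dict mapping every node to a score in [0, 1].
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Unlabeled nodes start at 0.0; seeds start at their known score.
    scores = {node: seeds.get(node, 0.0) for node in neighbors}
    for _ in range(num_iters):
        updated = {}
        for node, nbrs in neighbors.items():
            if node in seeds:
                # Clamp known labels so they are never overwritten.
                updated[node] = seeds[node]
            else:
                # Damped mean of the neighbors' scores from the last round.
                mean = sum(scores[n] for n in nbrs) / len(nbrs)
                updated[node] = damping * mean
        scores = updated
    return scores


def recall_candidates(scores, seeds, threshold=0.4):
    """Return unlabeled nodes whose propagated score reaches the threshold."""
    return sorted(
        n for n, s in scores.items() if n not in seeds and s >= threshold
    )


# Hypothetical toy graph: users (u*) linked to the items (i*) they bought.
edges = [("u1", "i1"), ("u2", "i1"), ("u2", "i2"), ("u3", "i3"), ("u4", "i3")]
seeds = {"u1": 1.0, "u3": 0.0}  # one known fraudster, one known benign user

scores = propagate_labels(edges, seeds)
candidates = recall_candidates(scores, seeds)
```

In this sketch, nodes close to a labeled fraudster receive higher scores than distant ones, and nodes connected only to benign seeds stay at zero. The recalled candidates could then serve as additional, weakly labeled training samples for a downstream model.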

Notes

https://www.taobao.com.
Data Availability Statements: The MOOC data that support the findings of this study are available from https://snap.stanford.edu/jodie/#datasets. The Taobao data that support the findings of this study are not openly available due to commercial restrictions, but may be made partially available in the future upon reasonable request and with the permission of Alibaba.

Acknowledgements

This work was supported by the China Postdoctoral Science Foundation (2021M692957) and the National Natural Science Foundation of China (62172372).

Author information

Authors and Affiliations

Alibaba-Zhejiang University Joint Institute of Frontier Technologies, Hangzhou, 310058, Zhejiang, China
Zhao Li
Zhejiang Lab, Hangzhou, 311121, Zhejiang, China
Zhao Li, Biao Wang, Yilun Jin & Zenghui Xu
Link2do Technology, Hangzhou, 311199, Zhejiang, China
Jiaming Huang
The University of Southern Queensland, Toowoomba, QLD, 4350, Australia
Ji Zhang
Central South University, Changsha, 410083, Hunan, China
Jianliang Gao
Southeast University, Nanjing, 214135, Jiangsu, China
Yilun Jin

Corresponding author

Correspondence to Biao Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, Z., Wang, B., Huang, J. et al. A graph-powered large-scale fraud detection system. Int. J. Mach. Learn. & Cyber. 15, 115–128 (2024). https://doi.org/10.1007/s13042-023-01786-w

Received: 11 July 2022
Accepted: 18 January 2023
Published: 14 February 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s13042-023-01786-w
