skip to main content
10.1145/3459637.3481945acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
research-article

Prohibited Item Detection on Heterogeneous Risk Graphs

Published: 30 October 2021 Publication History

Abstract

Prohibited item detection, which aims to detect illegal items hidden on e-commerce platforms, plays a significant role in evading risks and preventing crimes for online shopping. While traditional solutions usually focus on mining evidence from independent items, they cannot effectively utilize the rich structural relevance among different items. A naive idea is to directly deploy existing supervised graph neural networks to learn node representations for item classification. However, the very few manually labeled items with various risk patterns introduce two essential challenges: (1) How to enhance the representations of enormous unlabeled items? (2) How to enrich the supervised information in this few-labeled but multiple-pattern business scenario? In this paper, we construct item logs as a Heterogeneous Risk Graph (HRG), and propose the novel Heterogeneous Self-supervised Prohibited item Detection model (HSPD) to overcome these challenges. HSPD first designs the heterogeneous self-supervised learning model, which treats multiple semantics as the supervision to enhance item representations. Then, it presents the directed pairwise labeling to learn the distance from candidates to their most relevant prohibited seeds, which tackles the binary-labeled multi-patterned risks. Finally, HSPD integrates with self-training mechanisms to iteratively expand confident pseudo labels for enriching supervision. The extensive offline and online experimental results on three real-world HRGs demonstrate that HSPD consistently outperforms the state-of-the-art alternatives.

References

[1]
Hongyun Cai, Vincent W. Zheng, and Kevin Chen-Chuan Chang. 2018. A Com-prehensive Survey of Graph Embedding: Problems, Techniques, and Applications. IEEE Transactions on Knowledge and Data Engineering 30, 9 (2018), 1616--1637.
[2]
Yukuo Cen, Xu Zou, Jianwei Zhang, Hongxia Yang, Jingren Zhou, and Jie Tang. 2019. Representation Learning for Attributed Multiplex Heterogeneous Network. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019, Anchorage, AK, USA, August 4--8, 2019. ACM, 1358--1368.
[3]
Shaohua Fan, Junxiong Zhu, Xiaotian Han, Chuan Shi, Linmei Hu, Biyu Ma, and Yongliang Li. 2019. Metapath-guided Heterogeneous Graph Neural Network for Intent Recommendation. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019, Anchorage, AK, USA, August 4-8, 2019. ACM, 2478--2486.
[4]
Jerome H Friedman. 2001. Greedy function approximation: a gradient boosting machine. Annals of statistics (2001), 1189--1232.
[5]
Alberto García-Durán and Mathias Niepert. 2017. Learning Graph Representa-tions with Embedding Propagation. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA. 5119--5130.
[6]
William L. Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive Represen-tation Learning on Large Graphs. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA. 1024--1034.
[7]
Guoxiu He, Yangyang Kang, Zhe Gao, Zhuoren Jiang, Changlong Sun, Xiaozhong Liu, Wei Lu, Qiong Zhang, and Luo Si. 2019. Finding Camouflaged Needle in a Haystack?: Pornographic Products Detection via Berrypicking Tree Model. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2019, Paris, France, July 21-25, 2019. ACM, 365--374.
[8]
Shifu Hou, Yanfang Ye, Yangqiu Song, and Melih Abdulhayoglu. 2017. HinDroid: An Intelligent Android Malware Detection System Based on Structured Heteroge-neous Information Network. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13 - 17, 2017. ACM, 1507--1515.
[9]
Ziniu Hu, Yuxiao Dong, Kuansan Wang, Kai-Wei Chang, and Yizhou Sun. 2020. GPT-GNN: Generative Pre-Training of Graph Neural Networks. In KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, CA, USA, August 23-27, 2020. ACM, 1857--1867.
[10]
Ziniu Hu, Yuxiao Dong, Kuansan Wang, and Yizhou Sun. 2020. Heterogeneous Graph Transformer. In WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020. ACM / IW3C2, 2704--2710.
[11]
Dasol Hwang, Jinyoung Park, Sunyoung Kwon, Kyung-Min Kim, Jung-Woo Ha, and Hyunwoo J. Kim. 2020. Self-supervised Auxiliary Learning with Meta-paths for Heterogeneous Graphs. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual.
[12]
Yugang Ji, Chuan Shi, Fuzhen Zhuang, and Philip S. Yu. 2019. Integrating Topic Model and Heterogeneous Information Network for Aspect Mining with Rating Bias. In Advances in Knowledge Discovery and Data Mining - 23rd Pacific-Asia Conference, PAKDD 2019, Macau, China, April 14-17, 2019, Proceedings, Part I (Lecture Notes in Computer Science, Vol. 11439). Springer, 160--171.
[13]
Yugang Ji, Mingyang Yin, Hongxia Yang, Jingren Zhou, Vincent W. Zheng, Chuan Shi, and Yuan Fang. 2021. Accelerating Large-Scale Heterogeneous Interaction Graph Embedding Learning via Importance Sampling. ACM Transactions on Knowledge Discovery from Data 15, 1 (2021), 10:1--10:23.
[14]
Wei Jin, Tyler Derr, Haochen Liu, Yiqi Wang, Suhang Wang, Zitao Liu, and Jiliang Tang. 2020. Self-supervised Learning on Graphs: Deep Insights and New Direction. CoRR abs/2006.10141 (2020).
[15]
Jongmin Kim, Taesup Kim, Sungwoong Kim, and Chang D. Yoo. 2019. Edge-Labeling Graph Neural Network for Few-Shot Learning. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019. Computer Vision Foundation / IEEE, 11--20.
[16]
Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Opti-mization. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, Yoshua Bengio and Yann LeCun (Eds.).
[17]
Thomas N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net.
[18]
Qimai Li, Zhichao Han, and Xiao-Ming Wu. 2018. Deeper Insights Into Graph Convolutional Networks for Semi-Supervised Learning. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018. AAAI Press, 3538--3545.
[19]
Xiao Liu, Fanjin Zhang, Zhenyu Hou, Zhaoyu Wang, Li Mian, Jing Zhang, and Jie Tang. 2020. Self-supervised Learning: Generative or Contrastive. CoRR abs/2006.08218 (2020).
[20]
Yuanfu Lu, Yuan Fang, and Chuan Shi. 2020. Meta-learning on Heterogeneous Information Networks for Cold-start Recommendation. In KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, CA, USA, August 23-27, 2020. ACM, 1563--1573.
[21]
Jianxin Ma, Chang Zhou, Hongxia Yang, Peng Cui, Xin Wang, and Wenwu Zhu. 2020. Disentangled Self-Supervision in Sequential Recommenders. In KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, CA, USA, August 23--27, 2020. ACM, 483--491.
[22]
Tomás Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient Esti-mation of Word Representations in Vector Space. In 1st International Conference on Learning Representations, ICLR 2013, Scottsdale, Arizona, USA, May 2-4, 2013, Workshop Track Proceedings.
[23]
Chanyoung Park, Donghyun Kim, Jiawei Han, and Hwanjo Yu. 2020. Unsu-pervised Attributed Multiplex Network Embedding. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, New York, NY, USA, February 7-12, 2020. AAAI Press, 5371--5378.
[24]
Daniel Carlos Guimarães Pedronette and Longin Jan Latecki. 2021. Rank-based self-training for graph convolutional networks. Information Processing and Management 58, 2 (2021), 102443.
[25]
Zhen Peng, Wenbing Huang, Minnan Luo, Qinghua Zheng, Yu Rong, Tingyang Xu, and Junzhou Huang. 2020. Graph Representation Learning via Graphical Mutual Information Maximization. In WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020. ACM / IW3C2, 259--270.
[26]
Michael Sejr Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, and Max Welling. 2018. Modeling Relational Data with Graph Convolutional Networks. In The Semantic Web - 15th International Conference, ESWC 2018, Heraklion, Crete, Greece, June 3-7, 2018, Proceedings (Lecture Notes in Computer Science, Vol. 10843). Springer, 593--607.
[27]
Chuan Shi, Yitong Li, Jiawei Zhang, Yizhou Sun, and S Yu Philip. 2016. A survey of heterogeneous information network analysis. IEEE Transactions on Knowledge and Data Engineering 29, 1 (2016), 17--37.
[28]
Kaisong Song, Yangyang Kang, Wei Gao, Zhe Gao, Changlong Sun, and Xiaozhong Liu. 2021. Evidence Aware Neural Pornographic Text Identification for Child Protection. In AAAI.
[29]
Fan-Yun Sun, Jordan Hoffmann, Vikas Verma, and Jian Tang. 2020. InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenRe-view.net.
[30]
Ke Sun, Zhouchen Lin, and Zhanxing Zhu. 2020. Multi-Stage Self-Supervised Learning for Graph Convolutional Networks on Graphs with Few Labeled Nodes. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, New York, NY, USA, February 7-12, 2020. AAAI Press, 5892--5899.
[31]
Petar Velickovic, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. 2017. Graph Attention Networks. CoRR abs/1710.10903 (2017).
[32]
Petar Velickovic, William Fedus, William L. Hamilton, Pietro Liò, Yoshua Bengio, and R. Devon Hjelm. 2019. Deep Graph Infomax. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net.
[33]
Xiao Wang, Houye Ji, Chuan Shi, Bai Wang, Yanfang Ye, Peng Cui, and Philip S. Yu. 2019. Heterogeneous Graph Attention Network. In The World Wide Web Conference, WWW 2019, San Francisco, CA, USA, May 13-17, 2019. ACM, 2022--2032.
[34]
Xiao Wang, Ruijia Wang, Chuan Shi, Guojie Song, and Qingyong Li. 2020. Multi-Component Graph Convolutional Collaborative Filtering. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, New York, NY, USA, February 7-12, 2020. AAAI Press, 6267--6274.
[35]
Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S. Yu. 2021. A Comprehensive Survey on Graph Neural Networks. IEEE Transactions on Neural Networks and Learning Systems. 32, 1 (2021), 4--24.
[36]
Hong Xuan, Abby Stylianou, and Robert Pless. 2020. Improved Embeddings with Easy Positive Triplet Mining. In IEEE Winter Conference on Applications of Computer Vision, WACV 2020, Snowmass Village, CO, USA, March 1-5, 2020. IEEE, 2463--2471.
[37]
Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, and Yang Shen. 2020. Graph Contrastive Learning with Augmentations. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Infor-mation Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual.
[38]
Wentao Zhang, Yuan Fang, Zemin Liu, Min Wu, and Xinming Zhang. 2020. mg2vec: Learning Relationship-Preserving Heterogeneous Graph Representations via Metagraph Embedding. IEEE Transactions on Knowledge and Data Engineering (2020), 1--1. https://doi.org/10.1109/TKDE.2020.2992500
[39]
Vincent W. Zheng, Mo Sha, Yuchen Li, Hongxia Yang, Yuan Fang, Zhenjie Zhang, Kian-Lee Tan, and Kevin Chen-Chuan Chang. 2018. Heterogeneous Embedding Propagation for Large-Scale E-Commerce User Alignment. In IEEE International Conference on Data Mining, ICDM 2018, Singapore, November 17-20, 2018. IEEE Computer Society, 1434--1439.
[40]
Rong Zhu, Kun Zhao, Hongxia Yang, Wei Lin, Chang Zhou, Baole Ai, Yong Li, and Jingren Zhou. 2019. AliGraph: A Comprehensive Graph Neural Network Platform. Proceedings of VLDB Endowment. 12, 12 (2019), 2094--2105.
[41]
Barret Zoph, Golnaz Ghiasi, Tsung-Yi Lin, Yin Cui, Hanxiao Liu, Ekin Dogus Cubuk, and Quoc Le. 2020. Rethinking Pre-training and Self-training. In Ad-vances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual.

Cited By

View all
  • (2024)SUCOLAEngineering Applications of Artificial Intelligence10.1016/j.engappai.2023.107016126:PCOnline publication date: 1-Feb-2024
  • (2023)Datasets and Interfaces for Benchmarking Heterogeneous Graph Neural NetworksProceedings of the 32nd ACM International Conference on Information and Knowledge Management10.1145/3583780.3615117(5346-5350)Online publication date: 21-Oct-2023
  • (2023)Knowledge Based Prohibited Item Detection on Heterogeneous Risk GraphsProceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3580305.3599852(5260-5269)Online publication date: 4-Aug-2023
  • Show More Cited By

Index Terms

  1. Prohibited Item Detection on Heterogeneous Risk Graphs

        Recommendations

        Comments

        Information & Contributors

        Information

        Published In

        cover image ACM Conferences
        CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
        October 2021
        4966 pages
        ISBN:9781450384469
        DOI:10.1145/3459637
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Sponsors

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 30 October 2021

        Permissions

        Request permissions for this article.

        Check for updates

        Author Tags

        1. graph neural network
        2. heterogeneous self-supervised learning
        3. prohibited item detection
        4. self-training

        Qualifiers

        • Research-article

        Funding Sources

        Conference

        CIKM '21
        Sponsor:

        Acceptance Rates

        Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

        Upcoming Conference

        CIKM '25

        Contributors

        Other Metrics

        Bibliometrics & Citations

        Bibliometrics

        Article Metrics

        • Downloads (Last 12 months)20
        • Downloads (Last 6 weeks)2
        Reflects downloads up to 17 Feb 2025

        Other Metrics

        Citations

        Cited By

        View all
        • (2024)SUCOLAEngineering Applications of Artificial Intelligence10.1016/j.engappai.2023.107016126:PCOnline publication date: 1-Feb-2024
        • (2023)Datasets and Interfaces for Benchmarking Heterogeneous Graph Neural NetworksProceedings of the 32nd ACM International Conference on Information and Knowledge Management10.1145/3583780.3615117(5346-5350)Online publication date: 21-Oct-2023
        • (2023)Knowledge Based Prohibited Item Detection on Heterogeneous Risk GraphsProceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3580305.3599852(5260-5269)Online publication date: 4-Aug-2023
        • (2023)Clustering-Based Supervised Contrastive Learning for Identifying Risk Items on Heterogeneous GraphICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)10.1109/ICASSP49357.2023.10094817(1-5)Online publication date: 4-Jun-2023

        View Options

        Login options

        View options

        PDF

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader

        Figures

        Tables

        Media

        Share

        Share

        Share this Publication link

        Share on social media