DOI:10.1145/3583780.3615117
short-paper

Datasets and Interfaces for Benchmarking Heterogeneous Graph Neural Networks

Published: 21 October 2023

Abstract

In recent years, Heterogeneous Graph Neural Networks (HGNNs) have gained increasing attention due to their strong performance in applications. However, the lack of high-quality benchmarks in new fields has become a critical limitation for developing and applying HGNNs. To meet the urgent needs of emerging fields and support the advancement of HGNNs, we present two large-scale, challenging heterogeneous graph datasets drawn from real scenarios: risk commodity detection and takeout recommendation. We also establish standard benchmark interfaces that provide over 40 heterogeneous graph datasets. We provide initial data splits, unified evaluation metrics, and baseline results for future work, making it fair and convenient to evaluate state-of-the-art HGNNs. Our interfaces also offer a comprehensive toolkit for studying the characteristics of graph datasets. The new datasets are publicly available at https://zenodo.org/communities/hgd, and the interface code is available at https://github.com/BUPT-GAMMA/hgbi.

      Published In

      CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
      October 2023
      5508 pages
      ISBN:9798400701245
      DOI:10.1145/3583780
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. benchmark
      2. graph
      3. heterogeneous graph neural networks
      4. risk commodity detection
      5. takeout recommendation

      Qualifiers

      • Short-paper

      Conference

      CIKM '23
      Acceptance Rates

      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
