short-paper

Datasets and Interfaces for Benchmarking Heterogeneous Graph Neural Networks

Authors:

Chuan Shi

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management

Pages 5346 - 5350

https://doi.org/10.1145/3583780.3615117

Published: 21 October 2023

Abstract

In recent years, Heterogeneous Graph Neural Networks (HGNNs) have gained increasing attention due to their excellent performance in applications. However, the lack of high-quality benchmarks in new fields has become a critical limitation for developing and applying HGNNs. To meet the urgent needs of emerging fields and to advance HGNNs, we present two large-scale, challenging heterogeneous graph datasets from real-world scenarios: risk commodity detection and takeout recommendation. We also establish standard benchmark interfaces that provide over 40 heterogeneous graph datasets. We provide initial data splits, unified evaluation metrics, and baseline results for future work, making it fair and convenient to evaluate state-of-the-art HGNNs. Our interfaces also offer a comprehensive toolkit for studying the characteristics of graph datasets. The new datasets are publicly available at https://zenodo.org/communities/hgd, and the interface code is available at https://github.com/BUPT-GAMMA/hgbi.

Index Terms

1. Computing methodologies
  1. Artificial intelligence
2. Information systems
  1. Information systems applications
    1. Data mining

Published In

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management

October 2023

5508 pages

ISBN:9798400701245

DOI:10.1145/3583780

General Chairs: Ingo Frommholz (University of Wolverhampton, UK), Frank Hopfgartner (University of Koblenz, Germany), Mark Lee (University of Birmingham, UK), Michael Oakes (University of Birmingham, UK)

Program Chairs: Mounia Lalmas (Spotify, UK), Min Zhang (Tsinghua University, China), Rodrygo Santos (Federal University of Minas Gerais, Brazil)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Funding Sources

National Natural Science Foundation of China

Conference

CIKM '23

CIKM '23: The 32nd ACM International Conference on Information and Knowledge Management

October 21 - 25, 2023

Birmingham, United Kingdom

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
