Research Article
DOI: 10.1145/3394486.3403063

NodeAug: Semi-Supervised Node Classification with Data Augmentation

Published: 20 August 2020

Abstract

Using Data Augmentation (DA), we present a new method to enhance Graph Convolutional Networks (GCNs), which are the state-of-the-art models for semi-supervised node classification. DA for graph data remains under-explored: because of the connections built by edges, DA operations on different nodes influence each other and lead to undesired results, such as uncontrollable DA magnitudes and changes to ground-truth labels. To address this issue, we present the NodeAug (Node-Parallel Augmentation) scheme, which creates a 'parallel universe' for each node in which to conduct DA, blocking the undesired effects from other nodes. NodeAug regularizes the model prediction of every node (including unlabeled ones) to be invariant to the changes induced by DA, so as to improve effectiveness. To augment the input features from different aspects, we propose three DA strategies that modify both the node attributes and the graph structure. In addition, we introduce subgraph mini-batch training for an efficient implementation of NodeAug: each iteration takes as input the subgraph corresponding to the receptive fields of a batch of nodes, rather than the whole graph used by prior full-batch training. Empirically, NodeAug yields significant gains for strong GCN models on Cora, Citeseer, Pubmed, and two co-authorship networks, with a more efficient training process thanks to the proposed subgraph mini-batch training approach.
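The core mechanism summarized above is a consistency-regularization objective: the model's prediction for each node should remain invariant under DA applied within that node's own 'parallel universe'. Below is a minimal, hypothetical sketch of such an objective in PyTorch; the names model, augment, x, adj, y, labeled_mask, and lam are illustrative assumptions, and the sketch is not the authors' implementation.

# Hypothetical sketch of DA-based consistency regularization in the spirit of
# NodeAug (not the authors' code). Assumes a PyTorch GCN `model` mapping
# (features, adjacency) to per-node logits, a user-supplied `augment(x, adj)`
# returning an augmented view, labels `y`, and a boolean `labeled_mask`.
import torch.nn.functional as F

def nodeaug_style_loss(model, x, adj, y, labeled_mask, augment, lam=1.0):
    # Supervised cross-entropy on the labeled nodes only.
    logits = model(x, adj)
    sup_loss = F.cross_entropy(logits[labeled_mask], y[labeled_mask])

    # Consistency term over every node (labeled or not): predictions on the
    # augmented view should match the detached predictions on the original
    # view; KL divergence is one common choice for this penalty.
    x_aug, adj_aug = augment(x, adj)
    logits_aug = model(x_aug, adj_aug)
    p_orig = F.softmax(logits, dim=-1).detach()
    log_p_aug = F.log_softmax(logits_aug, dim=-1)
    cons_loss = F.kl_div(log_p_aug, p_orig, reduction="batchmean")

    return sup_loss + lam * cons_loss

In NodeAug, the three DA strategies (modifying node attributes and the graph structure) would play the role of the augment function, and the subgraph mini-batch scheme would restrict each call to the receptive-field subgraph of the current batch of nodes rather than the whole graph.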

Supplementary Material

MP4 File (3394486.3403063.mp4)
This video presents our paper, 'NodeAug: Semi-Supervised Node Classification with Data Augmentation'.

Published In

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
August 2020
3664 pages
ISBN:9781450379984
DOI:10.1145/3394486

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data augmentation
  2. graph convolutional networks
  3. graph mining
  4. semi-supervised learning

Funding Sources

  • NUS ODPRT Grant
  • Singapore Ministry of Education Academic Research Fund Tier 3 under MOE's official grant

Conference

KDD '20

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
