
Leveraging Relational Graph Neural Network for Transductive Model Ensemble

Published: 04 August 2023

Abstract

Traditional pre-training, fine-tuning, and ensembling pipelines often overlook essential relational data and the interconnections among tasks. To address this gap, we present a novel approach that harnesses this relational information via a relational graph-based model, the Relational grAph Model ensemBLE, abbreviated as RAMBLE. The model distinguishes itself by performing class label inference simultaneously across all data nodes and task nodes, employing the relational graph in a transductive manner. This fine-grained approach allows us to better comprehend and model the intricate interplay between data and tasks. Furthermore, we incorporate a novel variational information bottleneck-guided scheme for embedding fusion and aggregation. This technique produces an informative fused embedding, homing in on embeddings beneficial for the target task while filtering out noise-laden ones. Our theoretical analysis, grounded in information theory, confirms that using relational information for embedding fusion yields higher upper and lower bounds on the target task's accuracy. We assess the proposed model on eight diverse datasets, and the experimental results demonstrate that it effectively utilizes the relational knowledge derived from all pre-trained models, thereby enhancing performance on the target tasks.
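The core idea of transductive inference over a joint graph of data nodes and task nodes can be illustrated with a minimal label-propagation sketch. This is an assumption-laden toy, not the paper's actual implementation: the graph, function names, and sizes below are illustrative, and RAMBLE uses a learned relational GNN rather than fixed diffusion.

```python
import numpy as np

def normalize_adj(adj):
    """Symmetrically normalize A + I, as in standard GNN propagation."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def propagate_labels(adj, labels, observed, num_steps=20):
    """Transductive inference: infer class distributions for *all* nodes
    (data nodes and task nodes alike) at once. Rows where `observed` is
    True are clamped to their known labels; the rest start uniform."""
    a_hat = normalize_adj(adj)
    num_classes = labels.shape[1]
    y = np.full(labels.shape, 1.0 / num_classes)
    y[observed] = labels[observed]
    for _ in range(num_steps):
        y = a_hat @ y                      # diffuse label mass along edges
        y[observed] = labels[observed]     # re-clamp the observed nodes
        y = y / y.sum(axis=1, keepdims=True)
    return y

# Toy relational graph: nodes 0-1 are linked (e.g., data of one task),
# nodes 2-3 are linked (data of another); labels for nodes 1 and 3 are known.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 0, 0],
                [0, 0, 0, 1],
                [0, 0, 1, 0]], dtype=float)
labels = np.array([[0, 0], [1, 0], [0, 0], [0, 1]], dtype=float)
observed = np.array([False, True, False, True])
pred = propagate_labels(adj, labels, observed)
```

Here the unlabeled node 0 inherits class 0 from its neighbor and node 2 inherits class 1, which is the transductive effect the relational graph exploits: inference for every node is performed jointly over the whole graph rather than per sample.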

Supplementary Material

MP4 File (rtfp0341-2min-promo.mp4)
Our paper introduces RAMBLE, a novel relational graph-based model that extracts task-specific knowledge from multiple pre-trained models by performing class label inference across all data and task nodes simultaneously. By modeling the relationships between data and tasks, RAMBLE optimizes the utilization of knowledge, enhancing target-task performance. We employ a variational information bottleneck technique to develop an effective aggregation scheme that mitigates the impact of irrelevant or redundant information. A theoretical proof confirms that integrating relational information into representation fusion improves the accuracy bounds for the target task. Extensive experiments on eight datasets demonstrate RAMBLE's superior performance compared to single-model transfer methods and ensembles of multiple pre-trained models.
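The variational-information-bottleneck aggregation described above can be sketched under simplifying assumptions: a linear encoder, a standard-normal prior, and untrained weights. All names and shapes here are hypothetical; the paper's actual encoder and objective may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def vib_fuse(embeddings, w_mu, w_logvar, beta=1e-3):
    """Fuse per-model embeddings into a single stochastic bottleneck code z.
    The KL term pushes uninformative (noise-laden) dimensions toward the
    N(0, I) prior, so only task-relevant information survives compression."""
    h = np.concatenate(embeddings)            # stack embeddings from all pre-trained models
    mu = w_mu @ h                             # mean of q(z | h)
    logvar = w_logvar @ h                     # log-variance of q(z | h)
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps       # reparameterization trick
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)  # KL(q(z|h) || N(0, I))
    return z, beta * kl                       # beta trades compression vs. task accuracy

# Hypothetical usage: two pre-trained models yield 3- and 2-dim embeddings,
# fused into a 2-dim code with randomly chosen (untrained) encoder weights.
embs = [np.ones(3), np.zeros(2)]
w_mu = 0.1 * np.arange(10.0).reshape(2, 5)
w_logvar = np.zeros((2, 5))
z, kl_penalty = vib_fuse(embs, w_mu, w_logvar)
```

In training, the KL penalty would be added to the task loss so that the fused code keeps information predictive of the target labels while discarding the rest; only the mechanism, not the paper's exact objective, is shown here.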
MP4 File (meeting_02.mp4)


Cited By

  • (2025) Uncertainty-Aware Deep Neural Representations for Visual Analysis of Vector Field Data. IEEE Transactions on Visualization and Computer Graphics 31(1): 1343-1353. DOI: 10.1109/TVCG.2024.3456360. Online publication date: Jan 2025.
  • (2024) Uncertainty-Informed Volume Visualization using Implicit Neural Representation. 2024 IEEE Workshop on Uncertainty Visualization: Applications, Techniques, Software, and Decision Frameworks, 62-72. DOI: 10.1109/UncertaintyVisualization63963.2024.00013. Online publication date: 14 Oct 2024.
  • (2024) Adaptive Ensembles of Fine-Tuned Transformers for LLM-Generated Text Detection. 2024 International Joint Conference on Neural Networks (IJCNN), 1-7. DOI: 10.1109/IJCNN60899.2024.10651296. Online publication date: 30 Jun 2024.
  • (2024) Online Transfer Learning for RSV Case Detection. 2024 IEEE 12th International Conference on Healthcare Informatics (ICHI), 512-521. DOI: 10.1109/ICHI61247.2024.00074. Online publication date: 3 Jun 2024.
  • (2024) Edge-centric optimization: a novel strategy for minimizing information loss in graph-to-text generation. Complex & Intelligent Systems 11(1). DOI: 10.1007/s40747-024-01690-y. Online publication date: 19 Dec 2024.

Published In

KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2023, 5996 pages
ISBN: 9798400701030
DOI: 10.1145/3580305

Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. graph neural networks
    2. information bottlenecks
    3. transfer learning

    Qualifiers

    • Research-article

    Funding Sources

    • MBZUAI-WIS funding; MBZUAI start-up funding

Conference

KDD '23

Acceptance Rates

Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%


Article Metrics

• Downloads (last 12 months): 306
• Downloads (last 6 weeks): 14

Reflects downloads up to 08 Mar 2025

