Are Meta-Paths Necessary?: Revisiting Heterogeneous Graph Embeddings

Published: 17 October 2018
DOI: 10.1145/3269206.3271777

Abstract

The graph embedding paradigm projects nodes of a graph into a vector space, which can facilitate various downstream graph analysis tasks such as node classification and clustering. To efficiently learn node embeddings from a graph, graph embedding techniques usually preserve the proximity between node pairs sampled from the graph using random walks. In the context of a heterogeneous graph, which contains nodes from different domains, classical random walks are biased towards highly visible domains where nodes are associated with a dominant number of paths. To overcome this bias, existing heterogeneous graph embedding techniques typically rely on meta-paths (i.e., fixed sequences of node types) to guide random walks. However, using these meta-paths requires either prior knowledge from domain experts for optimal meta-path selection, or extended computations to combine all meta-paths shorter than a predefined length. In this paper, we propose an alternative solution that does not involve any meta-path. Specifically, we propose JUST, a heterogeneous graph embedding technique using random walks with JUmp and STay strategies to overcome the aforementioned bias in a more efficient manner. JUST not only gracefully balances between homogeneous and heterogeneous edges, but also balances the node distribution over different domains (i.e., node types). By conducting a thorough empirical evaluation of our method on three heterogeneous graph datasets, we show the superiority of our proposed technique. In particular, compared to Hin2vec, a state-of-the-art heterogeneous graph embedding technique that tries to optimally combine all meta-paths shorter than a predefined length, our technique yields better results in most experiments, with a dramatically reduced embedding learning time (about 3x speedup).
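
To make the jump-and-stay idea concrete, here is a minimal Python sketch of a random walk that, at each step, either stays in the current domain (follows an edge to a node of the same type) or jumps to another domain. The abstract does not give the exact rule, so everything specific here is an assumption made for illustration: the function name `jump_stay_walk`, the stay probability `alpha ** stay_run` (decaying with the number of consecutive nodes visited in the current domain), the `memory` parameter that steers jumps away from recently visited domains, and the toy graph. It should not be read as the authors' implementation.

```python
import random
from collections import deque


def jump_stay_walk(graph, node_type, start, walk_length,
                   alpha=0.5, memory=2, seed=None):
    """Illustrative jump/stay walk (assumed rule, not the paper's exact one).

    graph:     dict mapping node -> list of neighbor nodes
    node_type: dict mapping node -> domain label (string)
    alpha:     assumed decay base for the stay probability
    memory:    assumed number of recently visited domains to avoid on a jump
    """
    rng = random.Random(seed)
    walk = [start]
    recent = deque([node_type[start]], maxlen=memory)  # recently visited domains
    stay_run = 1  # consecutive nodes visited in the current domain
    current = start
    while len(walk) < walk_length:
        cur_type = node_type[current]
        same = [n for n in graph[current] if node_type[n] == cur_type]
        other = [n for n in graph[current] if node_type[n] != cur_type]
        if not same and not other:
            break  # isolated node: the walk cannot continue
        if not other:
            stay = True   # only homogeneous edges are available
        elif not same:
            stay = False  # only heterogeneous edges are available
        else:
            # Assumption: staying gets less likely the longer we have stayed.
            stay = rng.random() < alpha ** stay_run
        if stay:
            current = rng.choice(same)
            stay_run += 1
        else:
            reachable = {node_type[n] for n in other}
            fresh = reachable - set(recent)  # domains not visited recently
            target = rng.choice(sorted(fresh or reachable))
            current = rng.choice([n for n in other if node_type[n] == target])
            recent.append(target)
            stay_run = 1
        walk.append(current)
    return walk


# Hypothetical toy graph with three domains: authors, papers, venues.
toy_graph = {
    "a1": ["a2", "p1"],
    "a2": ["a1", "p2", "p1"],
    "p1": ["a1", "p2", "v1", "a2"],
    "p2": ["a2", "p1", "v1"],
    "v1": ["p1", "p2"],
}
toy_types = {"a1": "author", "a2": "author",
             "p1": "paper", "p2": "paper", "v1": "venue"}

print(jump_stay_walk(toy_graph, toy_types, "a1", walk_length=10, seed=0))
```

In this sketch, `alpha` close to 1 keeps the walk inside one domain for long stretches, while `alpha` close to 0 forces a domain change at almost every step; no meta-path is supplied at any point. The resulting walks would then be fed to a skip-gram style model, as in other random-walk embedding methods.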

Published In

CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
October 2018
2362 pages
ISBN: 9781450360142
DOI: 10.1145/3269206

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graph embedding
  2. heterogeneous graph
  3. random walk

Qualifiers

  • Research-article

Conference

CIKM '18

Acceptance Rates

CIKM '18 Paper Acceptance Rate 147 of 826 submissions, 18%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Cited By

  • (2025) Two-Dimensional Balanced Partitioning and Efficient Caching for Distributed Graph Analysis. IEEE Transactions on Parallel and Distributed Systems 36:2 (133-149). DOI: 10.1109/TPDS.2024.3501292. Online publication date: Feb-2025
  • (2025) Disentangled hyperbolic representation learning for heterogeneous graphs. Knowledge-Based Systems 310 (112976). DOI: 10.1016/j.knosys.2025.112976. Online publication date: Feb-2025
  • (2025) Navigating complexity: a comprehensive review of heterogeneous information networks and embedding techniques. Knowledge and Information Systems. DOI: 10.1007/s10115-025-02357-x. Online publication date: 13-Feb-2025
  • (2024) Graph embedding on mass spectrometry- and sequencing-based biomedical data. BMC Bioinformatics 25:1. DOI: 10.1186/s12859-023-05612-6. Online publication date: 2-Jan-2024
  • (2024) Schema-Aware Hyper-Relational Knowledge Graph Embeddings for Link Prediction. IEEE Transactions on Knowledge and Data Engineering (1-15). DOI: 10.1109/TKDE.2023.3323499. Online publication date: 2024
  • (2024) Community Enhanced Knowledge Graph for Recommendation. IEEE Transactions on Computational Social Systems 11:5 (5789-5802). DOI: 10.1109/TCSS.2024.3383603. Online publication date: Oct-2024
  • (2024) A Heterogeneous Graph Neural Network With Attribute Enhancement and Structure-Aware Attention. IEEE Transactions on Computational Social Systems 11:1 (829-838). DOI: 10.1109/TCSS.2023.3239034. Online publication date: Feb-2024
  • (2024) Drug-Target Prediction Based on Dynamic Heterogeneous Graph Convolutional Network. IEEE Journal of Biomedical and Health Informatics 28:11 (6997-7005). DOI: 10.1109/JBHI.2024.3441324. Online publication date: Nov-2024
  • (2024) Zero-shot Heterogeneous Graph Embedding via Aggregating Metapath Semantically. 2024 International Joint Conference on Neural Networks (IJCNN) (1-8). DOI: 10.1109/IJCNN60899.2024.10650430. Online publication date: 30-Jun-2024
  • (2024) DAHGN: Degree-Aware Heterogeneous Graph Neural Network. Knowledge-Based Systems 285 (111355). DOI: 10.1016/j.knosys.2023.111355. Online publication date: Feb-2024
