Abstract
Representation learning on textual networks, also called textual network embedding, leverages the rich textual information associated with the network structure to learn low-dimensional embeddings of vertices, and has proved useful in a variety of tasks. However, most approaches learn textual network embeddings from direct neighbors only. In this paper, we employ a powerful and spatially localized operation, personalized PageRank (PPR), to remove the restriction to direct connections. We also analyze the relationship between PPR and spectral-domain theory, which provides insight into the empirical performance boost. Experiments show that the proposed method substantially improves link prediction over existing methods, achieving a new state of the art on several real-world benchmark datasets.
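To illustrate how PPR reaches beyond direct neighbors, the following is a minimal sketch of the standard power-iteration form of personalized PageRank, pi = alpha * e_s + (1 - alpha) * P^T pi, where P is the row-normalized adjacency matrix and e_s is the indicator of the seed vertex. It is not the implementation used in the paper: the function name, the adjacency-list input format, the restart probability alpha = 0.15, and the toy path graph are all assumptions made for illustration.

```python
def personalized_pagerank(adj, seed, alpha=0.15, tol=1e-12, max_iter=10_000):
    """Power iteration for pi = alpha * e_seed + (1 - alpha) * P^T pi.

    adj   : dict mapping each vertex to a list of its neighbors
            (hypothetical input format; every neighbor must also be a key)
    seed  : vertex the random walk restarts from
    alpha : restart (teleport) probability, an assumed value
    """
    for v, nbrs in adj.items():
        if not nbrs:
            raise ValueError(f"vertex {v!r} has no outgoing edges")

    # Start with all probability mass on the seed vertex.
    pi = {v: 0.0 for v in adj}
    pi[seed] = 1.0

    for _ in range(max_iter):
        # Restart mass goes back to the seed.
        nxt = {v: 0.0 for v in adj}
        nxt[seed] = alpha
        # Remaining mass is spread uniformly over each vertex's neighbors.
        for v, nbrs in adj.items():
            share = (1.0 - alpha) * pi[v] / len(nbrs)
            for u in nbrs:
                nxt[u] += share
        # Stop once the L1 change between iterates is below the tolerance.
        delta = sum(abs(nxt[v] - pi[v]) for v in adj)
        pi = nxt
        if delta < tol:
            break
    return pi


# Path graph 0 - 1 - 2 - 3: vertices 2 and 3 are not direct neighbors of 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
scores = personalized_pagerank(path, seed=0)
```

On this toy graph, vertices 2 and 3 receive nonzero scores although neither is adjacent to the seed, and the score decays with distance from vertex 1 onward. This is the sense in which PPR is spatially localized without being restricted to direct connections.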
Acknowledgements
This work was supported by the National Science and Technology Major Projects on Core Electronic Devices, High-End Generic Chips and Basic Software (Grant No. 2018ZX01028101) and the National Natural Science Foundation of China (Grant No. 61732018). The authors thank the anonymous reviewers for their valuable comments, which improved the quality of this paper.
Cite this article
Li, T., Dou, Y. Representation learning on textual network with personalized PageRank. Sci. China Inf. Sci. 64, 212102 (2021). https://doi.org/10.1007/s11432-020-2934-6
DOI: https://doi.org/10.1007/s11432-020-2934-6