Abstract
In recent years, relationship prediction in heterogeneous information networks (HINs) has become an active research topic. The central challenge is how to effectively represent and exploit three kinds of information hidden in the network's connections: local structure information (Local-info), global structure information (Global-info) and attribute information (Attr-info). Although these kinds of information capture different features of the network and influence relationship creation in complementary ways, existing approaches use them separately or combine them only partially. In this article, a novel framework named Supervised Ranking framework (S-Rank) is proposed to tackle this issue. To avoid the class imbalance problem, S-Rank treats relationship prediction as a ranking task and divides it into three phases. First, a Supervised PageRank strategy (SPR) ranks the candidate nodes according to Global-info and Attr-info. Second, a Meta Path-based Ranking method (MPR) uses Local-info to rank the candidate nodes based on their meta path-based features. Finally, the two ranking scores are linearly integrated into a final ranking that combines Attr-info, Global-info and Local-info. Experiments on DBLP data demonstrate that S-Rank effectively exploits all three kinds of information for relationship prediction over HINs and outperforms well-known baseline approaches.
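The final phase described above, linear integration of the two ranking scores, can be illustrated with a minimal sketch. The abstract does not give the exact weighting scheme, so the convex combination below, the weight name `alpha`, and the function name `integrate_rankings` are assumptions made for illustration only; the SPR and MPR scores are taken as already computed and comparable in scale.

```python
from typing import Dict, List


def integrate_rankings(
    spr: Dict[str, float],
    mpr: Dict[str, float],
    alpha: float = 0.5,
) -> List[str]:
    """Rank candidates by alpha * SPR score + (1 - alpha) * MPR score.

    Assumption: a convex combination with a single weight `alpha`;
    the weighting actually used by S-Rank is not stated in the abstract.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # A candidate missing from one ranking contributes 0.0 for that score.
    candidates = set(spr) | set(mpr)
    combined = {
        c: alpha * spr.get(c, 0.0) + (1.0 - alpha) * mpr.get(c, 0.0)
        for c in candidates
    }
    # Highest combined score first; ties are broken by candidate name.
    return sorted(combined, key=lambda c: (-combined[c], c))
```

With `alpha = 1.0` the result reduces to the SPR ranking alone and with `alpha = 0.0` to the MPR ranking alone, so in this sketch the weight would be tuned on held-out data.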
Notes
Available at https://aminer.org/dblp_citation.
Note that SRW here is distinct from the SRW method mentioned in Table 1.
References
Backstrom L, Leskovec J (2011) Supervised random walks: predicting and recommending links in social networks. In: The fourth ACM international conference on Web search and data mining. ACM, pp 635–644
Blum A, Mitchell T (1998) Combining labeled and unlabeled data with co-training. In: The eleventh annual conference on computational learning theory. ACM, pp 92–100
Cao B, Kong X, Yu P S (2014) Collective prediction of multiple types of links in heterogeneous information networks. In: ICDM, pp 50–59
Cao X, Zheng Y, Shi C, Li J, Wu B (2016) Link prediction in schema-rich heterogeneous information network. In: Advances in knowledge discovery and data mining - 20th Pacific-Asia conference, PAKDD 2016, Auckland, New Zealand, April 19-22, 2016, Proceedings, Part I, pp 449–460
Deng Z H, Lai B Y, Wang Z H, Fang G D (2012) Pav: a novel model for ranking heterogeneous objects in bibliographic information networks. Expert Syst Appl 39(10):9788–9796
Fan R, Chang K, Hsieh C, Wang X, Lin C (2008) LIBLINEAR: a library for large linear classification. J Mach Learn Res 9:1871–1874
Gao B, Liu T, Wei W, Wang T, Li H (2011) Semi-supervised ranking on very large graphs with rich metadata. In: SIGKDD, pp 96–104
Han J (2012) Mining heterogeneous information networks: the next frontier. In: SIGKDD. ACM, pp 2–3
Hand D J, Till R J (2001) A simple generalisation of the area under the roc curve for multiple class classification problems, pp 171–186
He J, Bailey J, Zhang R (2014) Exploiting transitive similarity and temporal dynamics for similarity search in heterogeneous information networks. In: DASFAA, pp 141–155
Kautz H, Selman B, Shah M (1997) Referral web: combining social networks and collaborative filtering. Commun ACM 40(3):63–65
Kong X, Yu P S, Ding Y, Wild D J (2012) Meta path-based collective classification in heterogeneous information networks. In: The 21st ACM international conference on information and knowledge management. ACM, pp 1567–1571
Lee J B, Adorna H (2012) Link prediction in a modified heterogeneous bibliographic network. In: ASONAM. IEEE, pp 442–449
Liang W, He X, Tang D, Zhang X (2016) S-rank: a supervised ranking framework for relationship prediction in heterogeneous information networks. Lecture notes in computer science, vol 9799. Springer, pp 305–319
Liben-Nowell D, Kleinberg J M (2003) The link prediction problem for social networks. In: Proceedings of the 2003 ACM CIKM international conference on information and knowledge management. New Orleans, pp 556–559
Ma Y, Yang N, Li C, Zhang L, Yu P S (2015) Predicting neighbor distribution in heterogeneous information networks. In: Proceedings of the 2015 SIAM international conference on data mining. Vancouver, pp 784–791
Ma Z, Dai Q (2016) Selected an stacking elms for time series prediction. Neural Process Lett 44:831–856
Ma Z, Dai Q, Liu N (2015) Several novel evaluation measures for rank-based ensemble pruning with applications to time series prediction. Expert Syst Appl 42:280–292
Menon A K, Elkan C (2011) Link prediction via matrix factorization. In: ECML/PKDD (2), pp 437–452
Page L, Brin S, Motwani R, Winograd T (1999) The pagerank citation ranking: bringing order to the web
Rajkumar A, Agarwal S (2014) A statistical convergence perspective of algorithms for rank aggregation from pairwise data. In: ICML, pp 118–126
Shen W, Han J, Wang J (2014) A probabilistic model for linking named entities in web text with heterogeneous information networks. In: SIGMOD, pp 1199–1210
Shi B, Weninger T (2016) Fact checking in heterogeneous information networks. In: Proceedings of the 25th international conference on World Wide Web, WWW 2016, Montreal, Canada, April 11-15, 2016, Companion Volume, pp 101–102
Shi C, Zhang Z, Luo P, Yu P S, Yue Y, Wu B (2015) Semantic path based personalized recommendation on weighted heterogeneous information networks. In: Proceedings of the 24th ACM international on conference on information and knowledge management, CIKM 2015, Melbourne, VIC, Australia, October 19 - 23, 2015, pp 453–462
Sun Y, Barber R, Gupta M, Aggarwal C C, Han J (2011) Co-author relationship prediction in heterogeneous bibliographic networks. In: ASONAM. IEEE, pp 121–128
Sun Y, Han J, Yan X, Yu P S, Wu T (2011) Pathsim: meta path-based top-k similarity search in heterogeneous information networks. PVLDB 4(11):992–1003
Sun Y, Han J, Aggarwal C C, Chawla N V (2012) When will it happen? Relationship prediction in heterogeneous information networks. In: WSDM, pp 663–672
Tang J, Lou T, Kleinberg J (2012) Inferring social ties across heterogenous networks. In: The fifth ACM international conference on Web search and data mining. ACM, pp 743–752
Tang J, Zhang J, Yao L, Li J, Zhang L, Su Z (2008) Arnetminer: extraction and mining of academic social networks. In: SIGKDD, pp 990–998
Tang W, Zhuang H, Tang J (2011) Learning to infer social ties in large networks. In: Machine learning and knowledge discovery in databases - European conference, ECML PKDD 2011, Athens, Greece, September 5-9, 2011, Proceedings, Part III, pp 381–397
Wang C, Song Y, Li H, Zhang M, Han J (2016) Text classification with heterogeneous information network kernels. In: Proceedings of the thirtieth AAAI conference on artificial intelligence. Phoenix, pp 2130–2136
Yan L, Dodier R H, Mozer M, Wolniewicz R H (2003) Optimizing classifier performance via an approximation to the wilcoxon-mann-whitney statistic. In: ICML, pp 848–855
Yin Z, Gupta M, Weninger T, Han J (2010) A unified framework for link recommendation using random walks. In: ASONAM. IEEE, pp 152–159
Yu X, Gu Q, Zhou M, Han J (2012) Citation prediction in heterogeneous bibliographic networks. In: SDM. SIAM, pp 1119–1130
Yu X, Ren X, Sun Y, Gu Q, Sturt B, Khandelwal U, Norick B, Han J (2014) Personalized entity recommendation: a heterogeneous information network approach. In: Seventh ACM international conference on web search and data mining, WSDM 2014. New York, pp 283–292
Acknowledgments
This work was partially supported by the National High Technology Research and Development Program (863 Program) of China (No. 2015AA015403) and the National Science Foundation of China (No. 61632019).
Additional information
A preliminary version of this article appeared in the proceedings of the 29th International Conference on Industrial, Engineering and Other Applications of Applied Intelligence Systems (IEA/AIE 2016) [14].
Cite this article
Liang, W., Li, X., He, X. et al. Supervised ranking framework for relationship prediction in heterogeneous information networks. Appl Intell 48, 1111–1127 (2018). https://doi.org/10.1007/s10489-017-1044-7
DOI: https://doi.org/10.1007/s10489-017-1044-7