Integrated anchor and social link predictions across multiple social networks

Zhan, Qianyi; Zhang, Jiawei; Yu, Philip S.

doi:10.1007/s10115-018-1210-1

Regular Paper
Published: 22 May 2018

Volume 60, pages 303–326, (2019)

Knowledge and Information Systems

Abstract

In recent years, various online social networks offering specific services have gained great popularity and success. To enjoy more online social services, some users are involved in multiple social networks simultaneously. A challenging problem in social network studies is to identify the common users across networks in order to better understand user behavior. This is referred to as the anchor link prediction problem. Meanwhile, across these partially aligned social networks, users can be connected by different kinds of links, e.g., social links among users within a single network and anchor links between accounts of the shared users in different networks. Many link prediction methods have been proposed to predict each type of link separately. In this paper, we want to predict the formation of social links among users in the target network as well as anchor links aligning the target network with other external social networks. The problem is formally defined as the “collective link identification” problem. Predicting the formation of links in social networks with traditional link prediction methods, e.g., classification-based methods, can be very challenging. The reason is that, from the network, we can only obtain the formed links (i.e., positive links) but no information about the links that will never be formed (i.e., negative links). To solve the collective link identification problem, a unified link prediction framework, collective link fusion (CLF), is proposed in this paper. It consists of two phases: (1) collective link prediction of anchor and social links with positive and unlabeled learning techniques, and (2) propagation of predicted links across the partially aligned “probabilistic networks” with collective random walk. Extensive experiments conducted on two real-world partially aligned networks demonstrate that CLF performs very well in predicting social and anchor links concurrently.

References

Adamic L, Adar E (2001) Friends and neighbors on the web. Soc Netw 25:211–230
Backstrom L, Leskovec J (2011) Supervised random walks: predicting and recommending links in social networks. In: WSDM
Chang C-C, Lin C-J (2001) LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
Elkan C, Noto K (2008) Learning classifiers from only positive and unlabeled data. In: KDD
Fouss F, Pirotte A, Renders J, Saerens M (2007) Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. TKDE 19:355–369
Fujiwara Y, Nakatsuji M, Onizuka M, Kitsuregawa M (2012) Fast and exact top-k search for random walk with restart. Proc VLDB Endow 5(5):442–453
Getoor L, Diehl CP (2005) Link mining: a survey. SIGKDD Explor Newslett 7:3–12
Hasan M, Chaoji V, Salem S, Zaki M (2006) Link prediction using supervised learning. In: SDM
Hasan M, Zaki MJ (2011) A survey of link prediction in social networks. In: Aggarwal CC (ed) Social network data analytics. Springer, New York
Hsieh C-J, Natarajan N, Dhillon IS (2015) PU learning for matrix completion. In: ICML, pp 2445–2453
Hwang T, Kuang R (2010) A heterogeneous label propagation algorithm for disease gene discovery. In: SDM
Iofciu T, Fankhauser P, Abel F, Bischoff K (2011) Identifying users across social tagging systems. In: ICWSM
Jin S, Zhang J, Yu P, Yang S, Li A (2014) Synergistic partitioning in multiple large scale social networks. In: IEEE BigData
Kong X, Zhang J, Yu P (2013) Inferring anchor links across multiple heterogeneous social networks. In: CIKM
Konstas I, Stathopoulos V, Jose JM (2009) On social networks and collaborative recommendation. In: SIGIR
Leskovec J, Huttenlocher D, Kleinberg J (2010) Predicting positive and negative links in online social networks. In: WWW
Liben-Nowell D, Kleinberg J (2003) The link prediction problem for social networks. In: CIKM
Liu B, Dai Y, Li X, Lee W, Yu P (2003) Building text classifiers using positive and unlabeled examples. In: ICDM
Liu J, Zhang F, Song X, Song Y, Lin C, Hon H (2013) What’s in a name? An unsupervised approach to link users across communities. In: WSDM
Lü L, Zhou T (2011) Link prediction in complex networks: a survey. Phys A Stat Mech Its Appl 390(6):1150–1170
Namata G, Kok S, Getoor L (2011) Collective graph identification. In: KDD
Perkins D, Salomon G (1992) Transfer of learning. Pergamon Press, Oxford, England
Sahraeian S, Yoon B (2013) Smetana: accurate and scalable algorithm for probabilistic alignment of large-scale biological networks. PLoS ONE 8:e67995
Song D, Meyer D (2014) A model of consistent node types in signed directed social networks. In: ASONAM ’14 Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, IEEE Press, Piscataway, NJ, USA, pp 72–80
Tong H, Faloutsos C, Pan J (2006) Fast random walk with restart and its applications. In: ICDM
Wilcox K, Stephen AT (2012) Are close friends the enemy? Online social networks, self-esteem, and self-control. J Consum Res 40:90–103
Xi W, Zhang B, Chen Z, Lu Y, Yan S, Ma W, Fox E (2004) Link fusion: a unified link analysis framework for multi-type interrelated data objects. In: WWW
Xiang R, Neville J, Rogati M (2010) Modeling relationship strength in online social networks. In: WWW
Yao Y, Tong H, Yan X, Xu F, Lu J (2013) Matri: a multi-aspect and transitive trust inference model. In: WWW
Ye J, Cheng H, Zhu Z, Chen M (2013) Predicting positive and negative links in signed social networks by transfer learning. In: WWW
Zafarani R, Liu H (2009) Connecting corresponding identities across communities. In: ICWSM
Zhan Q, Wang S, Zhang J, Yu P, Xie J (2015) Influence maximization across partially aligned heterogenous social networks. In: PAKDD
Zhang J, Kong X, Yu P (2013) Predicting social links for new users across aligned heterogeneous social networks. In: ICDM
Zhang J, Kong X, Yu P (2014) Transferring heterogeneous links across location-based social networks. In: WSDM
Zhang J, Shao W, Wang S, Kong X, Yu P (2015) Pna: Partial network alignment with generic stable matching. In: IEEE IRI
Zhang J, Yu P (2015) Community detection for emerging networks. In: SDM
Zhang J, Yu P (2015) Mcd: Mutual clustering across multiple heterogeneous networks. In: IEEE BigData Congress
Zhang J, Yu P, Zhou Z (2014) Meta-path based multi-network collective link prediction. In: KDD
Zhao Y, Kong X, Yu P (2011) Positive and unlabeled learning for graph classification. In: ICDM

Acknowledgements

This work is supported by the Fundamental Research Funds for the Central Universities under grant JUSRP11852. This work was partially supported by Florida State University Council on Research and Creativity (CRC) via the Project ID 041776. This work is also supported in part by NSF through Grants IIS-1526499, IIS-1763325, CNS-1626432 and NSFC 61672313. The views and conclusions are those of the authors and should not be interpreted as representing the official policies of the funding agencies or the government.

Author information

Authors and Affiliations

School of Digital Media, Jiangnan University, Wuxi, China
Qianyi Zhan
IFM Lab, Department of Computer Science, Florida State University, Tallahassee, FL, USA
Jiawei Zhang
University of Illinois at Chicago, Chicago, IL, USA
Philip S. Yu
Jiangsu Key Laboratory of Media Design and Software Technology, Wuxi, China
Qianyi Zhan

Corresponding author

Correspondence to Qianyi Zhan.

Additional information

A preliminary version of this work appeared in: Proceedings of International Joint Conferences on Artificial Intelligence (IJCAI ’15), 2015.

Appendix

Social features of anchor links were introduced in the previous part. In this part, we introduce the social features of social links, as well as the spatial distribution, temporal distribution, and text usage features of both anchor links and social links.

6.1 Social features

See Table 3.

Table 3 Social features defined for social link (u, v)

$\Gamma (u)$ is the set of neighbors of user u.
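
Since the body of Table 3 is not reproduced here, the following is only a minimal sketch of neighborhood-based scores that could be computed from $\Gamma (u)$ and $\Gamma (v)$. The choice of common neighbors and the Jaccard coefficient is an assumption; the Adamic/Adar measure is named later in this appendix. The function name `social_features` and the toy graph are hypothetical.

```python
import math

def social_features(u, v, neighbors):
    """Neighborhood-overlap scores for a candidate link (u, v).

    `neighbors` maps each user to the set Gamma(user) of that user's neighbors.
    The feature set is an assumption; Table 3 may define others.
    """
    gu, gv = neighbors[u], neighbors[v]
    common = gu & gv
    union = gu | gv
    jaccard = len(common) / len(union) if union else 0.0
    # Adamic/Adar: each common neighbor w contributes 1 / log|Gamma(w)|.
    # Neighbors with degree <= 1 are skipped to avoid dividing by log(1) = 0.
    adamic_adar = sum(
        1.0 / math.log(len(neighbors[w]))
        for w in common
        if len(neighbors[w]) > 1
    )
    return {
        "common_neighbors": len(common),
        "jaccard": jaccard,
        "adamic_adar": adamic_adar,
    }

# Hypothetical undirected toy network.
neighbors = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"b", "d"},
}
features = social_features("a", "b", neighbors)
```

Here users a and b share the neighbors c and d, so the overlap-based scores are non-zero even though a and b are not yet connected.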

In addition to social information, we also extract features from users’ location check-ins. For a certain anchor/social link (u, v), we can get the sets of locations that u and v have been to, $\Phi (u)$ and $\Phi (v)$, respectively. Since each user can visit a location many times, we construct vectors l(u) and l(v) for u and v, respectively, in which each cell records the number of times that u or v visited a certain location in $\Phi (u) \cup \Phi (v)$.
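
The construction of l(u) and l(v) can be sketched as follows. This is an illustrative reading of the paragraph above, not the authors' code: the function name `visit_count_vectors`, the alphabetical ordering of the cells, and the sample check-ins are all assumptions.

```python
from collections import Counter

def visit_count_vectors(checkins_u, checkins_v):
    """Build aligned visit-count vectors l(u) and l(v).

    Each input is the list of locations a user has checked in at,
    with repeats for repeated visits. The shared index covers
    Phi(u) | Phi(v); sorting is only a convention to fix cell order.
    """
    counts_u, counts_v = Counter(checkins_u), Counter(checkins_v)
    locations = sorted(set(counts_u) | set(counts_v))
    # Counter returns 0 for locations a user never visited.
    l_u = [counts_u[loc] for loc in locations]
    l_v = [counts_v[loc] for loc in locations]
    return locations, l_u, l_v

# Hypothetical check-in histories.
locations, l_u, l_v = visit_count_vectors(
    ["cafe", "gym", "cafe"],
    ["gym", "library"],
)
```

Both vectors share one index over the union of visited locations, so they can be compared cell by cell.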

6.2 Spatial distribution features

See Table 4.

Table 4 Spatial distribution features for link (u, v)

Similarly, we can get the set of locations that u has visited from the networks, $\Phi (u)$. For a certain anchor/social link (u, v), we can extract its spatial distribution features by computing the measures summarized in Table 3, except the “Adamic/Adar” measure, on $\Phi (u)$ and $\Phi (v)$.

6.3 Temporal distribution features

See Table 5.

Table 5 Other frequently used features for link (u, v)

Users’ temporal activity information is also used to extract features for link (u, v). Each day is divided into 24 one-hour slots, and the number of online posts published in each hour is stored in vectors ${\mathbf {x}}(u)$ and ${\mathbf {x}}(v)$, from which we can extract $IP({\mathbf {x}}(u), {\mathbf {x}}(v))$, $ED({\mathbf {x}}(u), {\mathbf {x}}(v))$ and $CS({\mathbf {x}}(u), {\mathbf {x}}(v))$ summarized in Table 5 as the temporal distribution features of link (u, v).
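
As a rough illustration of this step, the sketch below builds the 24-slot activity vectors and computes three vector measures. Table 5 is not reproduced here, so reading IP, ED and CS as inner product, Euclidean distance and cosine similarity is an assumption, as are all function names and the sample timestamps.

```python
import math
from datetime import datetime

def hourly_histogram(timestamps):
    """Count posts per hour of day, giving a 24-cell vector x(user)."""
    hist = [0] * 24
    for ts in timestamps:
        hist[ts.hour] += 1
    return hist

def inner_product(x, y):
    return sum(a * b for a, b in zip(x, y))

def euclidean_distance(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def cosine_similarity(x, y):
    norm = math.sqrt(inner_product(x, x)) * math.sqrt(inner_product(y, y))
    # Return 0.0 when either user has no posts at all.
    return inner_product(x, y) / norm if norm else 0.0

# Hypothetical posting times for two users.
posts_u = [
    datetime(2015, 1, 1, 9, 5),
    datetime(2015, 1, 2, 9, 40),
    datetime(2015, 1, 2, 21, 0),
]
posts_v = [datetime(2015, 1, 1, 9, 30), datetime(2015, 1, 3, 22, 15)]

x_u = hourly_histogram(posts_u)
x_v = hourly_histogram(posts_v)
temporal = (
    inner_product(x_u, x_v),
    euclidean_distance(x_u, x_v),
    cosine_similarity(x_u, x_v),
)
```

Only the hour of day is kept, so posts from different days fall into the same slot; this matches the description of a per-day division into hourly slots.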

6.4 Text usage features

For a certain link (u, v), we can get the words that u and v have used in the past and group them as two bag-of-words vectors, ${\mathbf {x}}(u)$ and ${\mathbf {x}}(v)$, weighted by TF-IDF. From ${\mathbf {x}}(u)$ and ${\mathbf {x}}(v)$, we also extract $IP({\mathbf {x}}(u), {\mathbf {x}}(v))$, $ED({\mathbf {x}}(u), {\mathbf {x}}(v))$ and $CS({\mathbf {x}}(u), {\mathbf {x}}(v))$ summarized in Table 5 as the text usage features of link (u, v).
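
The TF-IDF weighting can be sketched as below. The text does not state which TF-IDF variant is used, so raw term counts multiplied by log(N / document frequency) is an assumption, with each user's collected words treated as one document. The names `tfidf_vectors` and `sparse_cosine`, and the toy word lists, are hypothetical; CS is again read as cosine similarity.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return a sparse TF-IDF vector (dict word -> weight) per user.

    `docs` maps each user to the list of words that user has posted.
    """
    n = len(docs)
    # Document frequency: number of users who used each word at least once.
    doc_freq = Counter(word for words in docs.values() for word in set(words))
    vectors = {}
    for user, words in docs.items():
        tf = Counter(words)
        vectors[user] = {w: c * math.log(n / doc_freq[w]) for w, c in tf.items()}
    return vectors

def sparse_cosine(x, y):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * y.get(k, 0.0) for k, w in x.items())
    nx = math.sqrt(sum(w * w for w in x.values()))
    ny = math.sqrt(sum(w * w for w in y.values()))
    return dot / (nx * ny) if nx and ny else 0.0

# Hypothetical word usage of three users.
docs = {
    "u": ["graph", "link", "link"],
    "v": ["graph", "link", "walk"],
    "w": ["graph", "music"],
}
vectors = tfidf_vectors(docs)
cs_uv = sparse_cosine(vectors["u"], vectors["v"])
```

With this weighting, a word used by every user (here "graph") gets weight zero, so the similarity between u and v is driven by the less common words they share.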

About this article

Cite this article

Zhan, Q., Zhang, J. & Yu, P.S. Integrated anchor and social link predictions across multiple social networks. Knowl Inf Syst 60, 303–326 (2019). https://doi.org/10.1007/s10115-018-1210-1

Received: 26 October 2016
Revised: 24 January 2018
Accepted: 06 May 2018
Published: 22 May 2018
Issue Date: 01 July 2019
DOI: https://doi.org/10.1007/s10115-018-1210-1
