COSNET: Connecting Heterogeneous Social Networks with Local and Global Consistency

Authors:

Philip S. Yu

KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Pages 1485 - 1494

https://doi.org/10.1145/2783258.2783268

Published: 10 August 2015

Abstract

More often than not, people are active in more than one social network. Identifying users from multiple heterogeneous social networks and integrating the different networks is a fundamental issue in many applications. The existing methods tackle this problem by estimating pairwise similarity between users in two networks. However, those methods suffer from potential inconsistency of matchings between multiple networks.
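
The inconsistency mentioned above can be illustrated with a small sketch. It assumes that a matcher outputs independent pairwise decisions between accounts, and that consistency means transitivity (if a matches b and b matches c, then a should match c). The network names, account names, and the function `transitivity_violations` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def transitivity_violations(matches):
    """Return triples (x, shared, y) where x~shared and shared~y were
    predicted, but x~y was not, so the pairwise decisions disagree."""
    matched = {frozenset(pair) for pair in matches}
    neighbors = {}
    for a, b in matches:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    violations = []
    for shared, others in neighbors.items():
        # Any two accounts matched to the same account should match each other.
        for x, y in combinations(sorted(others), 2):
            if frozenset((x, y)) not in matched:
                violations.append((x, shared, y))
    return violations


# Hypothetical pairwise predictions over three networks.
inconsistent = [
    (("net_A", "alice_a"), ("net_B", "alice_b")),
    (("net_B", "alice_b"), ("net_C", "alice_c")),
    (("net_A", "alice_a"), ("net_C", "bob_c")),
]
consistent = [
    (("net_A", "alice_a"), ("net_B", "alice_b")),
    (("net_B", "alice_b"), ("net_C", "alice_c")),
    (("net_A", "alice_a"), ("net_C", "alice_c")),
]

print(len(transitivity_violations(inconsistent)))
print(transitivity_violations(consistent))
```

A matcher that scores each pair of networks in isolation has no mechanism to rule out the first set of predictions, which is the gap the abstract points to.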

In this paper, we propose COSNET (COnnecting heterogeneous Social NETworks with local and global consistency), a novel energy-based model, to address this problem by considering both local and global consistency among multiple networks. An efficient subgradient algorithm is developed to train the model by converting the original energy-based objective function into its dual form.
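
As a rough illustration of the general idea of optimizing through a dual form with subgradient updates, the sketch below applies dual decomposition to a toy binary energy. It is a generic textbook-style example under stated assumptions, not the COSNET training algorithm: the energy, the function name `dual_decomposition`, and the step-size schedule are all assumptions made for illustration.

```python
def dual_decomposition(a, b, max_iters=100):
    """Minimize sum_i (a[i] + b[i]) * x[i] over binary x.

    The energy is split into two subproblems with their own copies x1, x2,
    coupled by Lagrange multipliers lam that penalize disagreement.
    """
    n = len(a)
    lam = [0.0] * n
    for t in range(1, max_iters + 1):
        # Solve each subproblem independently given the current multipliers.
        x1 = [1 if a[i] + lam[i] < 0 else 0 for i in range(n)]
        x2 = [1 if b[i] - lam[i] < 0 else 0 for i in range(n)]
        if x1 == x2:
            # Agreement certifies that the shared solution is optimal.
            return x1, t
        # Subgradient ascent on the dual with a diminishing step size.
        step = 1.0 / t
        lam = [lam[i] + step * (x1[i] - x2[i]) for i in range(n)]
    return None, max_iters


# Toy energy: a + b = [-2, -1], so the minimizer is x = [1, 1].
solution, iterations = dual_decomposition([1.0, -2.0], [-3.0, 1.0])
print(solution, iterations)
```

The appeal of this style of method is that each subproblem stays easy to solve on its own, while the multipliers gradually push the copies toward agreement.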

We evaluate the proposed model on two different genres of data collections: SNS and Academia, each consisting of multiple heterogeneous social networks. Our experimental results validate the effectiveness and efficiency of the proposed model. On both data collections, the proposed COSNET method significantly outperforms several alternative methods by up to 10-30% (p ≪ 0.001, t-test) in terms of F1-score. We also demonstrate that applying the integration results produced by our method can improve the accuracy of expert finding, an important task in social networks.
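
For readers unfamiliar with the metric, the following minimal sketch shows how an F1-score over predicted account matches could be computed. The function name `f1_score` and the example pairs are hypothetical; the paper's exact evaluation protocol is not described on this page.

```python
def f1_score(predicted, truth):
    """Harmonic mean of precision and recall over sets of matched pairs."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


# Hypothetical ground truth and predictions (pairs of account identifiers).
truth = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
predicted = {("a1", "b1"), ("a2", "b2"), ("a3", "b9")}

# Precision is 2/3 and recall is 2/4, so F1 is 4/7.
print(round(f1_score(predicted, truth), 4))
```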

Index Terms

COSNET: Connecting Heterogeneous Social Networks with Local and Global Consistency
1. Information systems
  1. Information systems applications
    1. Data mining

Published In

KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 2015

2378 pages

ISBN:9781450336642

DOI:10.1145/2783258

General Chairs:
Longbing Cao (University of Technology, Sydney)
Chengqi Zhang (University of Technology, Sydney)

Program Chairs:
Thorsten Joachims (Cornell University)
Geoff Webb (Monash University)
Dragos D. Margineantu (Boeing Research)
Graham Williams (Australian Taxation Office)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Basic Research Program of China
Natural Science Foundation of China
National High-tech R&D Program

Conference

KDD '15

KDD '15: The 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 10 - 13, 2015

Sydney, NSW, Australia

Acceptance Rates

KDD '15 Paper Acceptance Rate 160 of 819 submissions, 20%;

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
