DOI: 10.1145/2783258.2783268
research-article

COSNET: Connecting Heterogeneous Social Networks with Local and Global Consistency

Published: 10 August 2015

ABSTRACT

More often than not, people are active in more than one social network. Identifying users from multiple heterogeneous social networks and integrating the different networks is a fundamental issue in many applications. The existing methods tackle this problem by estimating pairwise similarity between users in two networks. However, those methods suffer from potential inconsistency of matchings between multiple networks.
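The inconsistency mentioned above can be illustrated with a small sketch. It is not the paper's method: it assumes three networks A, B, and C whose users were matched independently, pair by pair, and it only checks whether chaining A to B to C agrees with the direct A to C match. The function name, the dictionary representation, and all account names are hypothetical.

```python
def inconsistent_triples(ab, bc, ac):
    """Find users whose chained match A->B->C disagrees with the direct A->C match.

    Each argument is a dict mapping an account in one network to its
    predicted counterpart in another (a hypothetical representation).
    """
    conflicts = []
    for a, b in ab.items():
        c_via_b = bc.get(b)    # counterpart in C reached through B
        c_direct = ac.get(a)   # counterpart in C predicted directly
        if c_via_b is not None and c_direct is not None and c_via_b != c_direct:
            conflicts.append((a, b, c_via_b, c_direct))
    return conflicts


# Invented pairwise matching results for three networks.
ab = {"alice_tw": "alice_li", "bob_tw": "bob_li"}
bc = {"alice_li": "alice_fl", "bob_li": "robert_fl"}
ac = {"alice_tw": "alice_fl", "bob_tw": "bobby_fl"}

conflicts = inconsistent_triples(ab, bc, ac)
print(conflicts)
```

In this toy input the matches for `alice_tw` agree along both paths, while `bob_tw` is linked to two different accounts in the third network, which is the kind of conflict that independent pairwise matchers cannot rule out.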

In this paper, we propose COSNET (COnnecting heterogeneous Social NETworks with local and global consistency), a novel energy-based model, to address this problem by considering both local and global consistency among multiple networks. An efficient subgradient algorithm is developed to train the model by converting the original energy-based objective function into its dual form.

We evaluate the proposed model on two different genres of data collections: SNS and Academia, each consisting of multiple heterogeneous social networks. Our experimental results validate the effectiveness and efficiency of the proposed model. On both data collections, the proposed COSNET method significantly outperforms several alternative methods by up to 10-30% (p ≪ 0.001, t-test) in terms of F1-score. We also demonstrate that applying the integration results produced by our method can improve the accuracy of expert finding, an important task in social networks.

    • Published in

      KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
      August 2015
      2378 pages
      ISBN: 9781450336642
      DOI: 10.1145/2783258

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      KDD '15 paper acceptance rate: 160 of 819 submissions, 20%
      Overall acceptance rate: 1,133 of 8,635 submissions, 13%
