DOI: 10.1145/3078714.3078738
short-paper

SENA: Preserving Social Structure for Network Embedding

Published: 04 July 2017

ABSTRACT

Network embedding transforms a network into a continuous feature space. Network augmentation, on the other hand, leverages this feature representation to obtain a more informative network by adding plausible edges and removing noisy ones. Traditional network embedding methods are often inefficient at capturing (i) the latent relationships when the network is sparse (the network sparsity problem), and (ii) the local and global neighborhood structure of vertices (the structure-preserving problem).
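To make the idea of network augmentation concrete, the following is a minimal sketch, not the procedure used by SENA. It assumes a symmetric 0/1 adjacency matrix, uses cosine similarity between vertex embeddings as the plausibility score, and the function name `augment` and the thresholds `add_thresh` and `drop_thresh` are hypothetical choices made only for illustration.

```python
import numpy as np

def augment(adj, emb, add_thresh=0.9, drop_thresh=0.1):
    """Return an augmented copy of a symmetric 0/1 adjacency matrix.

    Assumption: cosine similarity of embeddings stands in for edge
    plausibility; the abstract does not specify the actual criterion.
    """
    # Normalize each embedding row to unit length (guard against zero rows).
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T  # pairwise cosine similarity

    out = adj.copy()
    # Add edges between unconnected vertices that are very similar.
    out[(adj == 0) & (sim >= add_thresh)] = 1
    # Remove existing edges between vertices that are very dissimilar.
    out[(adj == 1) & (sim <= drop_thresh)] = 0
    # Never keep self-loops.
    np.fill_diagonal(out, 0)
    return out
```

With three vertices where only vertices 0 and 1 are linked, but vertex 2 lies almost on top of vertex 0 in the embedding space, this sketch would drop the 0-1 edge and add a 0-2 edge.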

We propose SENA, a structural embedding and network augmentation framework for social network analysis. Unlike other embedding methods, which generate only vertex features, SENA generates features for both vertices and relations (edges) by minimizing an objective function composed of a loss function and a regularization term. The loss function mitigates the network sparsity problem by learning from both the edges present in the network (true edges) and those absent from it (false edges), whereas the regularization term preserves the structural properties of the network by efficiently considering (i) the local neighborhood of vertices and edges, and (ii) the network spectra, i.e., the eigenvectors of a symmetric matrix representing the network.
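The abstract does not give the objective in closed form, so the sketch below only illustrates the general shape of "loss over true and false edges plus a structure-preserving regularizer". Everything specific in it is an assumption: the translation-style energy borrowed from TransE, the margin ranking loss, the use of the graph Laplacian as the symmetric matrix, and the names `energy`, `margin_loss`, `laplacian_reg`, `objective`, `lam`, and `margin`.

```python
import numpy as np

def energy(vertex_emb, rel_emb, h, r, t):
    # Translation-style energy ||v_h + r - v_t|| (assumed, TransE-like):
    # low for plausible edges, high for implausible ones.
    return float(np.linalg.norm(vertex_emb[h] + rel_emb[r] - vertex_emb[t]))

def margin_loss(vertex_emb, rel_emb, true_edges, false_edges, margin=1.0):
    # Pair each true edge with a false edge and penalize the pair unless
    # the true edge's energy is lower by at least `margin`.
    total = 0.0
    for (h, r, t), (fh, fr, ft) in zip(true_edges, false_edges):
        pos = energy(vertex_emb, rel_emb, h, r, t)
        neg = energy(vertex_emb, rel_emb, fh, fr, ft)
        total += max(0.0, margin + pos - neg)
    return total

def laplacian_reg(adj, vertex_emb):
    # trace(X^T L X) with L = D - A (assumed symmetric matrix); small when
    # linked vertices have nearby embeddings.
    lap = np.diag(adj.sum(axis=1)) - adj
    return float(np.trace(vertex_emb.T @ lap @ vertex_emb))

def objective(adj, vertex_emb, rel_emb, true_edges, false_edges,
              lam=0.1, margin=1.0):
    # Loss on true/false edges plus a weighted structural regularizer.
    loss = margin_loss(vertex_emb, rel_emb, true_edges, false_edges, margin)
    return loss + lam * laplacian_reg(adj, vertex_emb)
```

Edges are written here as `(head, relation, tail)` index triples. In practice such an objective would be minimized with a gradient-based optimizer; the sketch only evaluates it.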

We compare SENA with four baseline network embedding methods: DeepWalk, SE, SME and TransE. We demonstrate the efficacy of SENA through a task-based evaluation on different real-world networks. We consider state-of-the-art algorithms for (i) community detection, (ii) link prediction and (iii) knowledge graph query answering, and show that with SENA's representation these algorithms achieve up to 10%, 9% and (surprisingly) 108% higher accuracy, respectively, compared to the best baseline embedding methods.


      • Published in

        HT '17: Proceedings of the 28th ACM Conference on Hypertext and Social Media
        July 2017
        336 pages
        ISBN: 9781450347082
        DOI: 10.1145/3078714

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        HT '17 Paper Acceptance Rate: 19 of 69 submissions, 28%
        Overall Acceptance Rate: 378 of 1,158 submissions, 33%
