Graph Mining Applications to Social Network Analysis

Managing and Mining Graph Data

Part of the book series: Advances in Database Systems (ADBS, volume 40)

Abstract

The prosperity of Web 2.0 and social media has brought about many diverse social networks of unprecedented scale, which present new challenges and call for more effective graph mining techniques. In this chapter, we present some graph patterns that are commonly observed in large-scale social networks. As most networks demonstrate strong community structure, one basic task in social network analysis is community detection, which uncovers the group membership of actors in a network. We categorize and survey representative graph mining approaches and evaluation strategies for community detection. We then present and discuss some research issues for future exploration.

Author information

Corresponding author

Correspondence to Lei Tang.

Copyright information

© 2010 Springer-Verlag US

About this chapter

Cite this chapter

Tang, L., Liu, H. (2010). Graph Mining Applications to Social Network Analysis. In: Aggarwal, C., Wang, H. (eds) Managing and Mining Graph Data. Advances in Database Systems, vol 40. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-6045-0_16

  • DOI: https://doi.org/10.1007/978-1-4419-6045-0_16

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-6044-3

  • Online ISBN: 978-1-4419-6045-0

  • eBook Packages: Computer Science, Computer Science (R0)
