Abstract
Social Network Analysis (SNA) is an established discipline for the study of groups of individuals with applications in several areas, like economics, information science, organizational studies and psychology. In the last fifteen years the exponential growth of online Social Network Sites (SNSs) , like Facebook, QQ and Twitter has provided a new challenging application context for SNA methods. However, with respect to traditional SNA application domains these systems are characterized by very large volumes of data, and this has recently led to the development of parallel network analysis algorithms and libraries. In this chapter we provide an overview of the state of the art in the field of large scale social network analysis; in particular, we focus on parallel algorithms and libraries for the computation of network centrality metrics.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
- 2.
- 3.
- 4.
- 5.
- 6.
Preferential attachment means that the probability that a new node A will be connected to an already existing node B is proportional to the number of edges that B already has.
- 7.
- 8.
- 9.
- 10.
References
Anderson, W., Briggs, P., Hellberg, C.S., Hess, D.W., Khokhlov, A., Lanzagorta, M., Rosenberg, R.: Early experience with scientific programs on the cray MTA-2. In: Proceedings of 2003 ACM/IEEE Conference on Supercomputing, SC’03, Phoenix, p. 46. ACM, New York, (2003). doi:10.1145/1048935.1050196
Aragon, C.R., GSeidel, R.: Randomized search trees. In: Annual IEEE Symposium on Foundations of Computer Science, Research Triangle Park. IEEE Computer Society, Los Alamitos, pp 540–545 (1989). doi:10.1109/SFCS.1989.63531
Asanovic, K., Bodik, R., Demmel, J., Keaveny, T., Keutzer, K., Kubiatowicz, J., Morgan, N., Patterson, D., Sen, K., Wawrzynek, J., Wessel, D., Yelick, K.: A view of the parallel computing landscape. Commun ACM 52, 56–67 (2009)
Bader, D.A., Madduri, K.: Designing multithreaded algorithms for breadth-first search and st-connectivity on the Cray MTA-2. In: Proceedings of International Conference on Parallel Processing, Columbus. IEEE Computer Society, Los Alamitos, pp 523–530 (2006). doi:10.1109/ICPP.2006.34
Bader, D.A., Madduri, K.: Parallel algorithms for evaluating centrality indices in real-world networks. In: Proceedings of 2006 International Conference on Parallel Processing, ICPP’06, Columbus, pp. 539–550. IEEE Computer Society, Washington, DC (2006). doi:10.1109/ICPP.2006.57
Bader, D.A., Madduri, K.: SNAP, small-world network analysis and partitioning: an open-source parallel graph framework for the exploration of large-scale networks. In: Proceedings of International Symposium on Parallel and Distributed Processing, IPDPS, Miami, pp. 1–12 (2008). doi:10.1109/IPDPS.2008.4536261
Bal, H.E., Maassen, J., van Nieuwpoort, R.V., Drost, N., Kemp, R., Palmer, N., Wrzesinska, G., Kielmann, T., Seinstra, F., Jacobs, C.: Real-world distributed computing with Ibis. Computer 43, 54–62 (2010). doi:10.1109/MC.2010.184
Barabasi, A.L., Albert, R.: Emergence of scaling in random networks. Science 286(5439), 11 (1999)
Barrett, B.W., Berry, J.W., Murphy, R.C., Wheeler, K.B.: Implementing a portable multi-threaded graph library: the MTGL on Qthreads. In: IEEE International Symposium on Parallel & Distributed Processing, IPDPS, Rome, pp. 1–8 (2009). doi:10.1109/IPDPS.2009.5161102
Berry, J.W., Hendrickson, B., Kahan, S., Konecny, P.: Graph software development and performance on the MTA-2 and Eldorado. In: 48th Cray Users Group Meeting, Lugano (2006)
Boost: Boost C++ Libraries. Available at http://www.boost.org/ (2011)
Borkar, S.: Design challenges of technology scaling. IEEE Micro 19(4), 23–29 (1999)
Brandes, U.: A faster algorithm for betweenness centrality. J. Math. Sociol. 25, 163–177 (2001)
Buluç, A., Gilbert, J.R.: The combinatorial BLAS: design, implementation, and applications. Int. J. High Perform. Comput. Appl. 25, 496–509 (2011). doi:10.1177/1094342011403516
Celli, F., Di Lascio, F., Magnani, M., Pacelli, B., Rossi, L.: Social network data and practices: the case of friendfeed. In: Chai, S.K., Salerno, J., Mabry, P. (eds.) Advances in Social Computing. LNCS, vol. 6007, pp 346–353. Springer, Berlin/Heidelberg (2010). doi:10.1007/978-3-642-12079-4_43
Combinatorial BLAS: Combinatorial BLAS Library (MPI reference implementation). Version 1.1, Available at http://gauss.cs.ucsb.edu/~aydin/CombBLAS/html/index.html (2011)
Culler, D., Singh, K.P., Gupta, A.: Parallel Computer Architecture – A Hardware/Software Approach. Morgan Kaufmann, San Francisco (1998)
Dean, J., Ghemawat, S.: Mapreduce: a flexible data processing tool. Commun. ACM 53(1), 72–77 (2010). doi:10.1145/1629175.1629198
DisNet: DisNet, A Framework for Distributed Graph Computation. Available at http://nd.edu/~dial/software.html (2011)
Du, N., Wang, H., Faloutsos, C.: Analysis of large multi-modal social networks: patterns and a generator. In: Proceedings of the 2010 European conference on Machine Learning and Knowledge Discovery in Databases: Part I, ECML PKDD’10, Barcelona, pp. 393–408. Springer, Berlin/Heidelberg, (2010). http://portal.acm.org/citation.cfm?id=1888258.1888291
Edmonds, N., Hoefler, T., Lumsdaine, A.: A space-efficient parallel algorithm for computing betweenness centrality in distributed memory. In: Proceedings of International Conference on High Performance Computing (HiPC), Dona Paula, pp. 1–10 (IEEE, 2010). doi:10.1109/HIPC.2010.5713180
Erdős, P., Rényi, A.: On random graphs I. Publ Math Debrecen 6, 290–297, 156 (1959)
Evans, B.M., Chi, E.H.: Towards a model of understanding social search. In: Proceedings of the 2008 ACM Conference on Computer Supported Cooperative Work, CSCW ’08, San Diego. ACM, New York, pp. 485–494 (2008). doi:10.1145/1460563.1460641
Feo, J., Harper, D., Kahan, S., Konecny, P.: Eldorado. In: Proceedings of 2nd Conference on Computing Frontiers, CF ’05, Ischia. ACM, New York, pp. 28–34 (2005)
Foster, I.: Designing and Building Parallel Programs: Concepts and Tools for Parallel Software Engineering. Addison-Wesley Longman, Boston (1995)
Freeman, L.C.: Centrality in social networks: a conceptual clarification. Soc. Netw. 1(3), 215–239 (1978–1979)
Gregor, D., Lumsdaine, A.: The Parallel BGL: A generic library for distributed graph computations. In: Parallel Object-Oriented Scientific Computing, POOSC, Glasgow (2005)
Hadoop.: Apache hadoop. Available at http://hadoop.apache.org/ (2011)
HipG.: HipG: High-level distributed processing of large-scale graphs. Available at http://www.cs.vu.nl/~ekr/hipg/ (2011)
Kang, U., Tsourakakis, C.E., Faloutsos, C.: PEGASUS: mining peta-scale graphs. Knowl. Inf. Syst. 27(2), 303–325 (2011). doi:10.1007/s10115-010-0305-0
Krepska, E., Kielmann, T., Fokkink, W., Bal, H.: A high-level framework for distributed processing of large-scale graphs. In: Proceedings of the 12th International Conference on Distributed Computing and Networking, ICDCN’11, Bangalore, pp. 155–166. Springer, Berlin/Heidelberg (2011)
Kumar, V., Gupta, A.G.A., Karpis, G.: Introduction to Parallel Computing, 2nd edn. Addison Wesley, Harlow (2003)
Lawson, C.L., Hanson, R.J., Kincaid, D.R., Krogh, F.T.: Basic linear algebra subprograms for fortran usage. ACM Trans Math Softw 5, 308–323 (1979). doi:10.1145/355841.355847
Lichtenwalter, R.N., Chawla, N.V.: DisNet: A framework for distributed graph computation. In: Proceedings 2011 International Conference on Social Networks Analysis and Mining (ASONAM), Kaohsiung (2011, to appear)
Lumsdaine, A., Gregor, D., Hendrickson, B., Berry, J.W.: Challenges in parallel graph processing. Parallel Process. Lett. 17(1), 5–20 (2007)
Madduri, K., Bader, D.A.: Compact graph representations and parallel connectivity algorithms for massive dynamic network analysis. In: Proceedings of International Parallel and Distributed Processing Symposium, IPDPS, Rome. IEEE Computer Society, Los Alamitos, pp. 1–11 (2009)
Madduri, K., Bader, D.A.: Small-world Network Analysis and Partitioning–Version 0.4. Available at http://snap-graph.sourceforge.net/ (2010)
Magnani, M., Rossi, L.: The ml-model for multi layer network analysis. In: IEEE International Conference on Advances in Social Network Analysis and Mining, Kaohsiung (2011)
Magnani, M., Rossi, L., Montesi, D.: Information propagation analysis in a social network site. In: 2010 International Conference on Advances in Social Networks Analysis and Mining, Odense, pp. 296–300. IEEE Computer Society, Los Alamitos (2010)
Message Passing Interface Forum MPI: A Message-Passing Interface Standard–Version 2.2. Available at http://www.mpi-forum.org/docs/ (2009)
Moore, G.E.: Cramming more components onto integrated circuits. Proc. IEEE 86(1), 82 (1998). doi:10.1109/JPROC.1998.658762
Moreno, J.L., Jennings, H.H.: Who Shall Survive?: A New Approach to the Problem of Human Interrelations. Nervous and Mental Disease Publishing Co., Washington, D.C. (1934)
OpenMP Architecture Review Board: OpenMP Application Program Interface–Version 3.1. Available at http://openmp.org/wp/ (2011)
Opsahl, T., Agneessens, F., Skvoretz, J.: Node centrality in weighted networks: Generalizing degree and shortest paths. Soc. Netw. 32(3), 245–251 (2010)
Page, L., Brin, S., Motwani, R., Winograd, T.: The pagerank citation ranking: bringing order to the web. Technical report, Stanford Digital Library Technologies Project (1998)
Pegasus: Project Pegasus. Available at http://www.cs.cmu.edu/~pegasus/ (2011)
Sandia National Laboratories: Multi-Threaded Graph Library–Version 1.0. Available at https://software.sandia.gov/trac/mtgl (2011)
Siek, J., Lee, L.Q., Lumsdaine, A.: The Boost Graph Library: User Guide and Reference Manual. Addison-Wesley, Boston (2002)
Trobec, R., Vajteršic, M., Zinterhof, P. (eds.): Parallel Computing: Numerics, Applications, and Trends. Springer, Dordrecht/New York (2009). doi:10.1007/978-1-84882-409-6_1
Watts, D.J., Strogatz, S.H.: Collective dynamics of “small-world” networks. Nature 393(6684), 440–442 (1998)
Weng, J., Lim, E.P., Jiang, J., He, Q.: Twitterrank: finding topic-sensitive influential twitterers. In: Proceedings of Third ACM International Conference on Web Search and Data Mining, WSDM ’10, New York, pp. 261–270. ACM, New York (2010). doi:10.1145/1718487.1718520
Wheeler, K.B., Murphy, R.C., Thain, D.: Qthreads: an api for programming with millions of lightweight threads. In: 22nd IEEE International Symposium on Parallel and Distributed Processing, IPDPS. IEEE, Miami, pp. 1–8 (2008). doi:10.1109/IPDPS.2008.4536359
White, D., Borgatti, S.: Betweenness centrality measures for directed graphs. Soc. Netw. 16(4), 335–346 (1994). doi:10.1016/0378-8733(94)90015-9
Acknowledgements
This work has been partially funded by PRIN project “Relazioni sociali e identità in rete: vissuti e narrazioni degli italiani nei siti di social network” and by FIRB project “Information monitoring, propagation analysis and community detection in Social Network Sites”. This work was done while M. Magnani and C. Paolino were with the Deptartment of Computer Science, University of Bologna.
The authors thank the CINECA supercomputing center for providing access to the IBM pSeries 575 used for part of the tests described in Sect. 6.6.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer Science+Business Media New York
About this chapter
Cite this chapter
Lambertini, M., Magnani, M., Marzolla, M., Montesi, D., Paolino, C. (2014). Large-Scale Social Network Analysis. In: Gkoulalas-Divanis, A., Labbi, A. (eds) Large-Scale Data Analytics. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-9242-9_6
Download citation
DOI: https://doi.org/10.1007/978-1-4614-9242-9_6
Published:
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-9241-2
Online ISBN: 978-1-4614-9242-9
eBook Packages: Computer ScienceComputer Science (R0)