Generation of a Social Network Graph by Using Apache Spark

Automatic Control and Computer Sciences

Abstract

We plan to develop a method for clustering a social network graph. Testing such a method requires a generated graph whose structure resembles that of existing social networks. This article presents an algorithm for the distributed generation of such a graph. The algorithm takes into account basic properties of social networks, such as the power-law distribution of the number of user communities and the dense intersections found in social networks, among others. It also addresses problems present in similar works by other authors, for example, the multiple-edges problem that arises during generation. A distinctive feature of the algorithm is that its implementation depends on the number of communities rather than on the number of connected users, as in other works. This choice reflects the way the structure of existing social networks develops. The paper describes the properties of the social network graph. Table 1 lists the variables needed for the algorithm. A step-by-step generation algorithm is given, and the appropriate mathematical parameters are calculated for it. Generation is performed in a distributed way with the Apache Spark framework, and the division of tasks by means of this framework is described in detail. The algorithm uses the Erdős–Rényi model for random graphs, as it is the most suitable and the easiest to implement. The main advantages of the method are low resource consumption and faster execution than other similar generators. The speed is achieved through distributed execution and the fact that, at any time, network users have unique numbers and are ordered by them, so no sorting is needed. The algorithm will not only support the creation of an efficient clustering method but may also be useful in other areas of development, for example, social network search engines.
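The ideas named in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' algorithm, whose steps and parameters are given only in the full paper. The sketch assumes community sizes drawn from a truncated power law, an Erdős–Rényi G(n, p) model applied inside each community, and a set of ordered pairs to avoid multiple edges. All function names, the exponent 2.5, and the size bounds are hypothetical choices made for illustration.

```python
import random
from itertools import combinations


def power_law_sizes(num_communities, min_size, max_size, exponent, rng):
    """Draw community sizes with P(size = k) proportional to k ** -exponent."""
    sizes = list(range(min_size, max_size + 1))
    weights = [k ** -exponent for k in sizes]
    return rng.choices(sizes, weights=weights, k=num_communities)


def community_edges(members, p, rng):
    """Erdős–Rényi G(n, p) on one community: keep each pair with probability p.

    Members are sorted first, so every edge is stored as (u, v) with u < v.
    """
    ordered = sorted(members)
    return {(u, v) for u, v in combinations(ordered, 2) if rng.random() < p}


def generate_graph(num_users, num_communities, p, seed=0):
    """Return (communities, edges) for a toy community-based social graph.

    Assumes num_users >= 2. The exponent and size bounds are illustrative.
    """
    rng = random.Random(seed)
    max_size = min(50, num_users)  # assumed upper bound on community size
    sizes = power_law_sizes(num_communities, 2, max_size, 2.5, rng)
    communities = []
    edges = set()
    for size in sizes:
        # Users are identified by unique integers 0 .. num_users - 1.
        members = rng.sample(range(num_users), size)
        communities.append(members)
        # Set union drops an edge that was already generated in another
        # community shared by the same two users (no multiple edges).
        edges |= community_edges(members, p, rng)
    return communities, edges


if __name__ == "__main__":
    communities, edges = generate_graph(num_users=1000, num_communities=100, p=0.3)
    print(len(communities), "communities,", len(edges), "distinct edges")
```

Because each community is processed independently, the loop body is the natural unit to distribute. In Spark, one possible arrangement would be to map over a collection of communities and deduplicate the resulting edge pairs afterwards. Whether the paper divides the work this way is not stated in the abstract.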

References

  1. Chikhradze, K.K., et al., On a model of social network with user communities for distributed generation of random social graphs, Mach. Learn. Data Anal., 2014, vol. 1, no. 8, pp. 1027–1047.

  2. Yang, J. and Leskovec, J., Community-affiliation graph model for overlapping network community detection, IEEE 12th Conference (International) on Data Mining, 2012.

  3. Raigorodskii, A., Models of random graphs and their applications to the web-graphs analysis, RUSSIR-2015, Moscow, 2015.

  4. Erdős, P. and Rényi, A., On the evolution of random graphs, Bull. Inst. Int. Stat. Tokyo, 1961, vol. 38, pp. 343–347.

  5. Aiello, W., Chung, F., and Lu, L., A random graph model for power law graphs. http://people.math.sc.edu/lu/papers/power.pdf.

  6. Karau, H., et al., Learning Spark: Lightning-fast Data Analysis, DMK Press, Moscow, 2015.

  7. Vovchok, S.I., Creation of the method for clustering the social media graph, Novye informatsionnye tekhnologii v nauke: Sbornik statey Mezhdunarodnoy nauchno-prakticheskoy konferentsii MTsII OMEGA SAYNS (New Information Technologies: Proc. Int. Sci.-Pract. Conf. MTsII OMEGA SAYNS), 2016, part 2, pp. 34–36.

Author information

Corresponding author

Correspondence to Y. A. Belov.

Additional information

Published in Russian in Modelirovanie i Analiz Informatsionnykh Sistem, 2016, Vol. 23, No. 6, pp. 777–783.

The article was translated by the authors.

About this article

Cite this article

Belov, Y.A., Vovchok, S.I. Generation of a Social Network Graph by Using Apache Spark. Aut. Control Comp. Sci. 51, 678–681 (2017). https://doi.org/10.3103/S0146411617070264

  • DOI: https://doi.org/10.3103/S0146411617070264
