
General graph generators: experiments, analyses, and improvements

  • Special Issue Paper
The VLDB Journal

This article has been updated

Abstract

Graph simulation is one of the most fundamental problems in graph processing and analytics. It helps users generate new graphs at different scales that mimic observed real-life graphs in many applications, such as social networks, biological networks, and information technology. In this paper, we focus on one of the most important types of graph generators: general graph generators, which aim to reproduce the properties of the observed graphs regardless of domain. Though a variety of graph generators have been proposed in the literature, there are still several important research gaps in this area. We first give an overview of the existing general graph generators, including recently emerged deep learning-based approaches. We classify them into four categories: simple model-based generators, complex model-based generators, autoencoder-based generators, and GAN-based generators. Then we conduct a comprehensive experimental evaluation of 20 representative graph generators based on 17 evaluation metrics and 12 real-life graphs. We provide a general roadmap of recommendations for how to select general graph generators under different settings. Furthermore, we propose a new method that achieves a good trade-off between simulation quality and efficiency. To help researchers and practitioners apply general graph generators in their applications or comprehensively evaluate their own general graph generators, we also implement an end-to-end platform that is publicly available.
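To make the fit, generate, and compare workflow described in the abstract concrete, the following is a minimal sketch using the simplest model-based generator, the Erdős–Rényi G(n, p) model. It is not taken from the paper or its platform. The function names, the toy star-shaped "observed" graph, and the two metrics compared (mean degree and maximum degree) are assumptions chosen for illustration only.

```python
import itertools
import random


def fit_edge_probability(num_nodes, edges):
    """Estimate the G(n, p) parameter p as the observed edge density."""
    possible_pairs = num_nodes * (num_nodes - 1) // 2
    return len(edges) / possible_pairs


def generate_gnp(num_nodes, p, seed=None):
    """Sample an undirected G(n, p) graph as a set of (u, v) pairs with u < v."""
    rng = random.Random(seed)
    return {
        (u, v)
        for u, v in itertools.combinations(range(num_nodes), 2)
        if rng.random() < p
    }


def degree_sequence(num_nodes, edges):
    """Return the degree of every node, indexed by node id."""
    degrees = [0] * num_nodes
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


def mean_degree(num_nodes, edges):
    """Average degree of an undirected graph."""
    return 2 * len(edges) / num_nodes


# Hypothetical "observed" graph: a star with node 0 as the hub.
NUM_NODES = 50
observed_edges = {(0, v) for v in range(1, NUM_NODES)}

# Fit the single model parameter, then generate a graph of the same size.
p_hat = fit_edge_probability(NUM_NODES, observed_edges)
simulated_edges = generate_gnp(NUM_NODES, p_hat, seed=7)

# Compare one metric the model targets (mean degree) and one it ignores
# (maximum degree).
for name, edges in (("observed", observed_edges), ("simulated", simulated_edges)):
    degrees = degree_sequence(NUM_NODES, edges)
    print(name, "mean degree:", mean_degree(NUM_NODES, edges),
          "max degree:", max(degrees))
```

In this toy setting the generator is fitted only to edge density, so it can be expected to land near the observed mean degree while missing the hub structure entirely. Gaps of this kind are one reason to evaluate a generator against several metrics rather than a single one.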


Change history

  • 22 November 2021

    The original version was revised to update the sixth author's affiliation.

Notes

  1. https://github.com/xiangsheng1325/GraphGenerator


Acknowledgements

This work was supported by 2018YFB2100801, NSFC62102287, 19511101300. Ying Zhang is supported by FT170100128 and ARC DP210101393. Lu Qin is supported by ARC FT200100787. Xuemin Lin is supported by NSFC61232006, 2018YFB1003504, ARC DP200101338 and ARC DP180103096.

Author information


Corresponding author

Correspondence to Dawei Cheng.


About this article


Cite this article

Xiang, S., Wen, D., Cheng, D. et al. General graph generators: experiments, analyses, and improvements. The VLDB Journal 31, 897–925 (2022). https://doi.org/10.1007/s00778-021-00701-5


  • DOI: https://doi.org/10.1007/s00778-021-00701-5
