Abstract
Graph simulation is one of the most fundamental problems in graph processing and analytics. It helps users generate new graphs at different scales that mimic observed real-life graphs in applications such as social networks, biological networks, and information technology. In this paper, we focus on one of the most important types of graph generators: general graph generators, which aim to reproduce the properties of observed graphs regardless of domain. Although a variety of graph generators have been proposed in the literature, several important research gaps remain in this area. We first give an overview of the existing general graph generators, including recently emerged deep learning-based approaches, and classify them into four categories: simple model-based generators, complex model-based generators, autoencoder-based generators, and GAN-based generators. We then conduct a comprehensive experimental evaluation of 20 representative graph generators using 17 evaluation metrics and 12 real-life graphs, and we provide a general roadmap of recommendations for selecting general graph generators under different settings. Furthermore, we propose a new method that achieves a good trade-off between simulation quality and efficiency. To help researchers and practitioners apply general graph generators in their applications, or comprehensively evaluate their own proposed generators, we also implement an end-to-end platform that is publicly available.





Change history
22 November 2021
The original version was revised to update the sixth author's affiliation.
Acknowledgements
This work was supported by 2018YFB2100801, NSFC62102287, 19511101300. Ying Zhang is supported by FT170100128 and ARC DP210101393. Lu Qin is supported by ARC FT200100787. Xuemin Lin is supported by NSFC61232006, 2018YFB1003504, ARC DP200101338 and ARC DP180103096.
Cite this article
Xiang, S., Wen, D., Cheng, D. et al. General graph generators: experiments, analyses, and improvements. The VLDB Journal 31, 897–925 (2022). https://doi.org/10.1007/s00778-021-00701-5
DOI: https://doi.org/10.1007/s00778-021-00701-5