Growth patterns and models of real-world hypergraphs

  • Regular Paper
Knowledge and Information Systems

Abstract

What kinds of macroscopic structural and dynamical patterns can we observe in real-world hypergraphs? What local dynamics of individual entities could underlie these patterns, beyond apparently random evolution? Graphs, which provide effective ways to represent pairwise interactions among entities, fail to represent group interactions (e.g., collaborations of three or more researchers). Regarded as a generalization of graphs, hypergraphs, which allow edges of various sizes, prove fruitful in addressing this limitation. However, the increased complexity makes it challenging to understand hypergraphs as thoroughly as graphs. In this work, we closely examine seven structural and dynamical properties of real hypergraphs from six domains. To this end, we define new measures, extend notions of common graph properties to hypergraphs, and assess the significance of observed patterns by comparison with a null model and statistical tests. We also propose HyperFF, a stochastic model for generating realistic hypergraphs. Its merits are threefold: (a) Realistic: it successfully reproduces all seven patterns, in addition to five patterns established in previous studies; (b) Self-contained: unlike previously proposed models, it does not rely on oracles (i.e., unexplainable external information) at all, and it is parameterized by just two scalars; and (c) Emergent: it relies on simple and interpretable mechanisms on individual entities, which do not trivially enforce but surprisingly lead to macroscopic properties. While HyperFF is mathematically intractable, we provide theoretical justifications and mathematical analysis based on its simplified version.

Notes

  1. This work is an extended version of [14], which was presented at the 20th IEEE International Conference on Data Mining. In this extended version, we introduce a simplified version of HyperFF and, based on it, theoretically justify densification and heavy-tailed degree distributions. As additional experiments, we test the stability of HyperFF (Fig. 9 in Sect. 6.2) and examine both the overlapping patterns of hyperedges (Fig. 11 in Sect. 6.2) and the occurrences of 26 hypergraph motifs in hypergraphs generated by HyperFF (Figs. 12 and 13 in Sect. 6.2). Additionally, we analyze the effect of the parameters p and q on the structures and dynamics of the hypergraphs that HyperFF generates (Figs. 14 and 15 in Sect. 6.3). Lastly, we perform an ablation study that considers six variants of HyperFF (Figs. 16 and 17 in Sect. 6.4).

  2. Linear interpolation is used to find such d.

  3. Note that the tree distance between leaf nodes in dynamic CGAH is double that in static CGAH. Thus, we divide the exponent by 2 when computing the probability.

  4. Formally, the \(n\)-level decomposed graph of a hypergraph \(G=(V, E)\) is defined as \(G_{(n)}=(V_{(n)}, E_{(n)})\), where

    $$\begin{aligned} V_{(n)} &:= \left\{ v_{(n)} \in 2^{V} : |v_{(n)}| = n \text{ and } \exists e\in E \text{ s.t. } v_{(n)} \subseteq e \right\}, \\ E_{(n)} &:= \left\{ \{u_{(n)},v_{(n)}\} \in \binom{V_{(n)}}{2} : \exists e\in E \text{ s.t. } u_{(n)} \cup v_{(n)} \subseteq e \right\}. \end{aligned}$$

  5. The egonet of a node v is defined as \(\{e \in E \mid v \in e\}\), the set of hyperedges containing v.
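
The definitions in Notes 4 and 5 can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions, not the authors' implementation: the function names `decomposed_graph` and `egonet` are hypothetical, a hypergraph is assumed to be given as a list of node sets, and nodes are assumed to be hashable.

```python
from itertools import combinations


def decomposed_graph(hyperedges, n):
    """Return (V_n, E_n), the n-level decomposed graph of a hypergraph.

    `hyperedges` is an iterable of node collections (an assumed input format).
    """
    edges = [frozenset(e) for e in hyperedges]
    nodes = set()  # V_(n): size-n node subsets contained in some hyperedge
    links = set()  # E_(n): unordered pairs of such subsets
    for e in edges:
        subsets = [frozenset(s) for s in combinations(e, n)]
        nodes.update(subsets)
        # Two size-n subsets of the same hyperedge e satisfy (u | v) <= e,
        # which is exactly the adjacency condition in the definition.
        for u, v in combinations(subsets, 2):
            links.add(frozenset((u, v)))
    return nodes, links


def egonet(hyperedges, v):
    """Return the hyperedges that contain node v (Note 5)."""
    return [e for e in hyperedges if v in e]


if __name__ == "__main__":
    toy = [{1, 2, 3}, {3, 4}]  # hypothetical toy hypergraph
    v2, e2 = decomposed_graph(toy, 2)
    print(len(v2), len(e2))
    print(egonet(toy, 3))
```

For the toy hypergraph with hyperedges {1, 2, 3} and {3, 4}, the 2-level decomposed graph has four nodes ({1, 2}, {1, 3}, {2, 3}, {3, 4}) and three edges, all of which come from the hyperedge {1, 2, 3}. With n = 1, the construction reduces to the familiar clique expansion of the hypergraph.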

References

  1. Akoglu L, McGlohon M, Faloutsos C (2010) Oddball: spotting anomalies in weighted graphs. In: PAKDD

  2. Alstott J, Bullmore E, Plenz D (2014) powerlaw: a Python package for analysis of heavy-tailed distributions. PLoS ONE 9(1):e85777

  3. Barabási AL, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512

  4. Benson AR, Abebe R, Schaub MT, Jadbabaie A, Kleinberg J (2018) Simplicial closure and higher-order link prediction. Proc Natl Acad Sci 115(48):E11221–E11230

  5. Benson AR, Kumar R, Tomkins A (2018) Sequences of sets. In: KDD

  6. Choe M, Yoo J, Lee G, Baek W, Kang U, Shin K (2022) Midas: representative sampling from real-world hypergraphs. In: WWW

  7. Clauset A, Shalizi CR, Newman ME (2009) Power-law distributions in empirical data. SIAM Rev 51(4):661–703

  8. Do MT, Yoon Se, Hooi B, Shin K (2020) Structural patterns and generative models of real-world hypergraphs. In: KDD

  9. Drobyshevskiy M, Turdakov D (2019) Random graph modeling: a survey of the concepts. ACM Comput Surv (CSUR) 52(6):1–36

  10. Erdős P, Rényi A (1960) On the evolution of random graphs. Publ Math Inst Hung Acad Sci 5(1):17–60

  11. Faloutsos M, Faloutsos P, Faloutsos C (1999) On power-law relationships of the internet topology. ACM SIGCOMM Comput Commun Rev 29(4):251–262

  12. Girvan M, Newman ME (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99(12):7821–7826

  13. Kang U, Tsourakakis CE, Faloutsos C (2011) Pegasus: mining peta-scale graphs. Knowl Inf Syst 27(2):303–325

  14. Kook Y, Ko J, Shin K (2020) Evolution of real-world hypergraphs: patterns and models without oracles. In: ICDM

  15. Lee G, Shin K (2021) Thyme+: temporal hypergraph motifs and fast algorithms for exact counting. In: ICDM

  16. Lee G, Ko J, Shin K (2020) Hypergraph motifs: concepts, algorithms, and discoveries. PVLDB 13(11):2256–2269

  17. Lee G, Choe M, Shin K (2021) How do hyperedges overlap in real-world hypergraphs? Patterns, measures, and generators. In: WWW

  18. Lee K, Ko J, Shin K (2022) Slugger: lossless hierarchical summarization of massive graphs. In: ICDE

  19. Leskovec J, Faloutsos C (2006) Sampling from large graphs. In: KDD

  20. Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: densification and shrinking diameters. ACM Trans Knowl Discov Data 1(1):2–es

  21. Leskovec J, Chakrabarti D, Kleinberg J, Faloutsos C, Ghahramani Z (2010) Kronecker graphs: an approach to modeling networks. J Mach Learn Res 11:985–1042

  22. Mastrandrea R, Fournet J, Barrat A (2015) Contact patterns in a high school: a comparison between data collected using wearable sensors, contact diaries and friendship surveys. PLoS ONE 10(9):e0136497

  23. McLachlan GJ (1987) On bootstrapping the likelihood ratio test statistic for the number of components in a normal mixture. J Roy Stat Soc Ser C (Appl Stat) 36(3):318–324

  24. Milo R, Itzkovitz S, Kashtan N, Levitt R, Shen-Orr S, Ayzenshtat I, Sheffer M, Alon U (2004) Superfamilies of evolved and designed networks. Science 303(5663):1538–1542

  25. Murphy RC, Wheeler KB, Barrett BW, Ang JA (2010) Introducing the graph 500. Cray Users Group (CUG) 19:45–74

  26. Sala A, Cao L, Wilson C, Zablit R, Zheng H, Zhao BY (2010) Measurement-calibrated graph models for social network experiments. In: WWW

  27. Sales-Pardo M, Guimera R, Moreira AA, Amaral LAN (2007) Extracting the hierarchical organization of complex systems. Proc Natl Acad Sci 104(39):15224–15229

  28. Salihoglu S, Widom J (2013) GPS: a graph processing system. In: SSDBM

  29. Shin K, Eliassi-Rad T, Faloutsos C (2018) Patterns and anomalies in k-cores of real-world graphs with applications. Knowl Inf Syst 54(3):677–710

  30. Tsourakakis CE (2008) Fast counting of triangles in large real networks without counting: algorithms and laws. In: ICDM, pp 608–617

  31. Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393(6684):440–442

  32. Woolf B (1957) The log likelihood ratio test (the g-test). Ann Hum Genet 21(4):397–409

  33. Zhang Y, Humbert M, Surma B, Manoharan P, Vreeken J, Backes M (2020) Towards plausible graph anonymization. In: NDSS

Acknowledgements

This work was supported by the Young Scientist Fellowship funded by the Institute for Basic Science (IBS-R029-Y2), a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2020R1C1C1008296), and an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-00075, Artificial Intelligence Graduate School Program (KAIST)).

Author information

Corresponding author

Correspondence to Kijung Shin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ko, J., Kook, Y. & Shin, K. Growth patterns and models of real-world hypergraphs. Knowl Inf Syst 64, 2883–2920 (2022). https://doi.org/10.1007/s10115-022-01739-9

  • DOI: https://doi.org/10.1007/s10115-022-01739-9
