Abstract
Group interactions arise in our daily lives (email communications, on-demand ride sharing, and comment interactions in online communities, to name a few), and together they form hypergraphs that evolve over time. Given such temporal hypergraphs, how can we describe their underlying design principles? If their sizes and time spans differ considerably, how can we compare their structural and temporal characteristics? In this work, we define 96 temporal hypergraph motifs (TH-motifs) and propose the relative occurrences of their instances as an answer to the above questions. TH-motifs categorize the relational and temporal dynamics among three connected hyperedges that appear within a short time. For scalable analysis, we develop THyMe\(^{+}\), a fast and exact algorithm for counting the instances of TH-motifs in massive hypergraphs, and we show that THyMe\(^{+}\) is up to 2,163\(\times \) faster than baseline approaches while requiring less space. In addition to exact counting algorithms, we design three versions of sampling algorithms for approximate counting. We theoretically analyze the accuracy of the proposed methods, and we empirically show that the most advanced of them is up to \(11.1\times \) more accurate than baseline approaches. Using the algorithms, we investigate 11 real-world temporal hypergraphs from various domains. We demonstrate that TH-motifs provide important information useful for downstream tasks and reveal interesting patterns, including the striking similarity between temporal hypergraphs from the same domain.
References
Alstott J, Bullmore E, Plenz D (2014) powerlaw: a python package for analysis of heavy-tailed distributions. PLoS ONE 9(1):e85777
Amburg I, Veldt N, Benson A (2020) Clustering in graphs and hypergraphs with categorical edge labels. In: WWW
Arenas A, Fernandez A, Fortunato S, Gomez S (2008) Motif-based communities in complex networks. J Phys A Math Theor 41(22):224001
Benson AR, Gleich DF, Leskovec J (2016) Higher-order organization of complex networks. Science 353(6295):163–166
Benson AR, Abebe R, Schaub MT, Jadbabaie A, Kleinberg J (2018a) Simplicial closure and higher-order link prediction. Proc Natl Acad Sci 115(48):E11221–E11230
Benson AR, Kumar R, Tomkins A (2018b) Sequences of sets. In: KDD
Borgatti SP, Everett MG (1997) Network analysis of 2-mode data. Soc Netw 19(3):243–269
Chodrow PS (2020) Configuration models of random hypergraphs. J Complex Netw 8(3):cnaa018
Choe M, Yoo J, Lee G, Baek W, Kang U, Shin K (2022) Midas: Representative sampling from real-world hypergraphs. In: WWW
Choo H, Shin K (2022) On the persistence of higher-order interactions in real-world hypergraphs. In: SDM
Clauset A, Shalizi CR, Newman ME (2009) Power-law distributions in empirical data. SIAM Rev 51(4):661–703
Do MT, Yoon Se, Hooi B, Shin K (2020) Structural patterns and generative models of real-world hypergraphs. In: KDD
Feng Y, You H, Zhang Z, Ji R, Gao Y (2019) Hypergraph neural networks. In: AAAI
Gurukar S, Ranu S, Ravindran B (2015) Commit: A scalable approach to mining communication motifs from dynamic networks. In: SIGMOD
Hwang T, Tian Z, Kuang R, Kocher JP (2008) Learning on weighted hypergraphs to integrate protein interactions and gene expressions for cancer outcome prediction. In: ICDM
Karypis G, Aggarwal R, Kumar V, Shekhar S (1999) Multilevel hypergraph partitioning: Applications in VLSI domain. TLVLSI 7(1):69–79
Kim S, Choe M, Yoo J, Shin K (2022) Reciprocity in directed hypergraphs: Measures, findings, and generators. In: ICDM
Ko J, Kook Y, Shin K (2022) Growth patterns and models of real-world hypergraphs. Knowl Inf Syst 64(11):2883–2920
Kook Y, Ko J, Shin K (2020) Evolution of real-world hypergraphs: Patterns and models without oracles. In: ICDM
Kovanen L, Karsai M, Kaski K, Kertész J, Saramäki J (2011) Temporal motifs in time-dependent networks. J Stat Mech Theory Exp 11:P11005
Lee G, Shin K (2021) Thyme+: Temporal hypergraph motifs and fast algorithms for exact counting. In: ICDM
Lee G, Ko J, Shin K (2020) Hypergraph motifs: concepts, algorithms, and discoveries. PVLDB 13:2256–2269
Lee G, Choe M, Shin K (2021) How do hyperedges overlap in real-world hypergraphs?–patterns, measures, and generators. In: WWW
Lee G, Choe M, Shin K (2022a) Hashnwalk: Hash and random walk based anomaly detection in hyperedge streams. In: IJCAI
Lee G, Yoo J, Shin K (2022b) Mining of real-world hypergraphs: Patterns, tools, and generators. In: CIKM
Lee JB, Rossi RA, Kong X, Kim S, Koh E, Rao A (2019) Graph convolutional networks with motif-based attention. In: CIKM
Li P, Milenkovic O (2017) Inhomogeneous hypergraph clustering with applications. In: NIPS
Li PZ, Huang L, Wang CD, Lai JH (2019) Edmot: An edge enhancement approach for motif-aware community detection. In: KDD
Li Y, Lou Z, Shi Y, Han J (2018) Temporal motifs in heterogeneous information networks. In: MLG workshop
Liu P, Benson AR, Charikar M (2019) Sampling methods for counting temporal motifs. In: WSDM
Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U (2002) Network motifs: simple building blocks of complex networks. Science 298(5594):824–827
Milo R, Itzkovitz S, Kashtan N, Levitt R, Shen-Orr S, Ayzenshtat I, Sheffer M, Alon U (2004) Superfamilies of evolved and designed networks. Science 303(5663):1538–1542
Paranjape A, Benson AR, Leskovec J (2017) Motifs in temporal networks. In: WSDM
Redmond U, Cunningham P (2013) Temporal subgraph isomorphism. In: ASONAM
Rossi RA, Ahmed NK, Koh E (2018a) Higher-order network representation learning. In: WWW Companion
Rossi RA, Zhou R, Ahmed NK (2018b) Deep inductive graph representation learning. IEEE TKDE 32(3):438–452
Rossi RA, Ahmed NK, Carranza A, Arbour D, Rao A, Kim S, Koh E (2020a) Heterogeneous graphlets. ACM TKDD 15(1):1–43
Rossi RA, Ahmed NK, Koh E, Kim S, Rao A, Abbasi-Yadkori Y (2020b) A structural graph representation learning framework. In: WSDM
Shen-Orr SS, Milo R, Mangan S, Alon U (2002) Network motifs in the transcriptional regulation network of Escherichia coli. Nat Genet 31(1):64–68
Tsourakakis CE, Pachocki J, Mitzenmacher M (2017) Scalable motif-aware graph clustering. In: WWW
Yadati N, Nimishakavi M, Yadav P, Nitin V, Louis A, Talukdar P (2018) Hypergcn: A new method of training graph convolutional networks on hypergraphs. arXiv preprint arXiv:1809.02589
Yang D, Qu B, Yang J, Cudre-Mauroux P (2019) Revisiting user mobility and social relationships in LBSNs: A hypergraph embedding approach. In: WWW
Yin H, Benson AR, Leskovec J, Gleich DF (2017) Local higher-order graph clustering. In: KDD
Yoon Se, Song H, Shin K, Yi Y (2020) How much and when do we need higher-order information in hypergraphs? a case study on hyperedge prediction. In: WWW
Yu J, Tao D, Wang M (2012) Adaptive hypergraph learning and its application in image classification. TIP 21(7):3262–3272
Yu Y, Lu Z, Liu J, Zhao G, Wen Jr (2019) Rum: Network representation learning using motifs. In: ICDE
Zhao H, Xu X, Song Y, Lee DL, Chen Z, Gao H (2018) Ranking users in social networks with higher-order structures. In: AAAI
Acknowledgements
This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2020R1C1C1008296) and an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-00075, Artificial Intelligence Graduate School Program (KAIST)).
Appendices
Appendix A: Details of Dynamic Programming (DP)
In this appendix, we provide details of Dynamic Programming (DP), a straightforward extension of temporal network motif counting [33]. As shown in Algorithm 6, DP uses dynamic programming to avoid redundantly enumerating every instance of TH-motifs in the input temporal hypergraph T. The procedure count (lines 9–19) counts the instances of TH-motifs that induce a given set of \(\ell \) connected static hyperedges. That is, given a set \(s=\{\tilde{e}_1,\dots ,\tilde{e}_{\ell }\}\) of \(\ell \) connected static hyperedges, count first constructs a time-sorted sequence e(s) of the temporal hyperedges whose set of nodes is one of the static hyperedges in s (line 10). It also introduces a map C that maintains the counts of ordered sequences of hyperedges of length at most \(\ell \). Then count scans through the temporal hyperedges in e(s) and tracks the subsequences that occur within a temporal window spanning at most \(\delta \) time units. As the temporal window slides over e(s), the counts of the sequences are computed from the counts of their subsequences stored in C. Refer to [33] for more intuition behind this dynamic programming formulation.
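The sliding-window counting idea can be illustrated with a minimal Python sketch in the spirit of [33]. It is not Algorithm 6 itself: the function name `count_ordered_sequences`, the `(timestamp, label)` event format (a label standing for one static hyperedge in s), and the window condition \(t_{last} - t_{first} \le \delta \) are assumptions made for illustration, and the mapping from label sequences to TH-motifs is omitted.

```python
from collections import Counter


def count_ordered_sequences(events, delta, length=3):
    """Count ordered label sequences of the given length within delta time units.

    events: list of (timestamp, label) pairs sorted by timestamp (assumed format).
    Keys shorter than `length` hold counts inside the current window;
    keys of exactly `length` accumulate the final totals.
    """
    counts = Counter()
    start = 0  # index of the oldest event still inside the window

    def decrement(label):
        # Remove the oldest event: drop it as a singleton, then drop every
        # window-local sequence that started with it (shorter suffixes first).
        counts[(label,)] -= 1
        for size in range(1, length - 1):
            for suffix in [k for k in counts if len(k) == size]:
                counts[(label,) + suffix] -= counts[suffix]

    def increment(label):
        # Append the new event to every sequence already in the window
        # (longer prefixes first, so the new event is used at most once).
        for size in range(length - 1, 0, -1):
            for prefix in [k for k in counts if len(k) == size]:
                counts[prefix + (label,)] += counts[prefix]
        counts[(label,)] += 1

    for t, label in events:
        while events[start][0] + delta < t:
            decrement(events[start][1])
            start += 1
        increment(label)

    return {k: v for k, v in counts.items() if len(k) == length and v}


# Usage: labels "a", "b", "c" stand for three static hyperedges.
example = [(0, "a"), (1, "b"), (2, "a"), (5, "c"), (6, "b")]
result = count_ordered_sequences(example, delta=3)
```

Each event is touched once on entry and once on exit of the window, so no instance is enumerated individually; only the counts of its prefixes are reused.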
Appendix B: Details of Discrete Time Partitioning (DTP)
In this appendix, we provide details of Discrete Time Partitioning (DTP), a straightforward extension of a sampling algorithm for approximate temporal network motif counting [30]. As shown in Algorithm 7, it begins by randomly drawing a shift s from \(\{-c\delta +1,\cdots ,0\}\), where the predefined input integer \(c>0\) controls the size of the sampling windows (line 3). Then, based on the selected shift s, the time is discretely partitioned into \(c\delta \)-sized intervals:
The probability that an instance \(\langle e_i, e_j, e_k \rangle \) of a TH-motif is completely contained within an interval in \(\mathscr {I}_s\) is:
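As a worked sketch of this probability, assume integer timestamps, a shift s drawn uniformly at random, and half-open intervals of length \(c\delta \). The indexing by j and the symbols \(t_i\) and \(t_k\) for the timestamps of \(e_i\) and \(e_k\) are notational assumptions made here, in the spirit of [30]. Under these assumptions the partition can be written as

\[ \mathscr {I}_s = \big \{ [\, s+(j-1)c\delta ,\; s+jc\delta \,) : j\in \mathbb {Z} \big \} \]

An instance is split across two intervals exactly when an interval boundary falls in \((t_i, t_k]\). Among the \(c\delta \) equally likely shifts, exactly \(t_k - t_i\) place a boundary there, so

\[ \Pr \big [\langle e_i, e_j, e_k \rangle \text { is contained in some } I\in \mathscr {I}_s\big ] = 1-\frac{t_k-t_i}{c\delta } \]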
Then, for each interval \(I\in \mathscr {I}_s\), DTP constructs a sub-temporal hypergraph that consists of the temporal hyperedges within I (line 6). From this sub-temporal hypergraph, it uses THyMe to exhaustively discover the instances of TH-motifs. It weights each discovered instance by the inverse of the probability that the instance is completely contained in a single interval. Adding up the weighted counts over all intervals yields an unbiased estimate of the number of instances of each TH-motif.
To speed up the estimation, DTP incorporates importance sampling to pick only a subset of the intervals and combines the counts from them. Specifically, letting \(I_j\) be the j-th interval, it samples each interval \(I_j\in \mathscr {I}_s\) independently with a predefined probability \(z_j\) (line 5). Here, DTP uses the ratio of the number of temporal hyperedges that arrive within the interval, which can be easily obtained from the input temporal hypergraph. Formally, the ratio \(z_j\) is
where r is a constant that controls the magnitude of the probability. Then, the number of instances counted in the interval is additionally weighted by \(\frac{1}{z_j}\) (line 9). These computations are repeated b times to reduce the variance of the estimation. Refer to [30] for details of the original algorithm.
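The estimator described above can be sketched in Python as follows. This is a simplified illustration, not Algorithm 7. A brute-force counter over sorted integer timestamps stands in for THyMe, connectivity checks and motif classification are omitted, and the sampling probability is assumed to take the form \(z_j=\min (1, r\cdot m_j/m)\), where \(m_j\) is the number of temporal hyperedges in \(I_j\) and m is the total. All function and parameter names are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations


def exact_count(times, delta):
    """Brute-force stand-in for an exact counter: triples spanning <= delta."""
    return sum(1 for a, _, k in combinations(times, 3) if k - a <= delta)


def weighted_count(times, delta, c, shift, r=None, rng=None):
    """Weighted count for one shift; times must be sorted integers.

    With r=None every interval is kept (z_j = 1). Otherwise interval j is
    kept with the assumed probability z_j = min(1, r * m_j / m).
    """
    width = c * delta
    buckets = defaultdict(list)
    for t in times:
        buckets[(t - shift) // width].append(t)  # index of the interval holding t
    total = 0.0
    for chunk in buckets.values():
        z = 1.0
        if r is not None:
            z = min(1.0, r * len(chunk) / len(times))
            if rng.random() >= z:
                continue  # interval not sampled
        for a, _, k in combinations(chunk, 3):
            if k - a <= delta:
                p = 1.0 - (k - a) / width  # chance the instance is not split
                total += 1.0 / (p * z)
    return total


def dtp_estimate(times, delta, c, r, b, seed=0):
    """Average of b independent weighted counts, each with a fresh random shift."""
    rng = random.Random(seed)
    width = c * delta
    runs = [
        weighted_count(times, delta, c, rng.randint(-width + 1, 0), r, rng)
        for _ in range(b)
    ]
    return sum(runs) / b


# Usage with hypothetical timestamps.
times = [0, 1, 2, 4, 5, 9, 10, 11, 15]
estimate = dtp_estimate(times, delta=3, c=2, r=2.0, b=50, seed=1)
```

With \(c\ge 2\) and every interval kept, averaging `weighted_count` over all \(c\delta \) shifts reproduces `exact_count`, which is the unbiasedness property in miniature; larger b trades running time for lower variance.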
Appendix C: Sub-algorithms of THyMe and THyMe+
In this appendix, we provide pseudocode of four sub-algorithms of THyMe and THyMe\(^{+}\). These algorithms are used for analyzing the time complexity of THyMe and THyMe\(^{+}\) in Sect. 4.
About this article
Cite this article
Lee, G., Shin, K. Temporal hypergraph motifs. Knowl Inf Syst 65, 1549–1586 (2023). https://doi.org/10.1007/s10115-023-01837-2
DOI: https://doi.org/10.1007/s10115-023-01837-2