Abstract
The application of dynamic graph representation learning to dynamic systems, such as social networks and transportation networks, has grown rapidly in recent years, owing to its ability to efficiently integrate topological and temporal information into a compact representation. Continuous-time dynamic graphs (CTDGs) have received considerable attention because they retain precise temporal information. However, existing random-walk-based methods often use time-biased sampling to extract dynamic graph patterns while neglecting the topological properties of the graph. Moreover, previous anonymous walks do not share node identifiers and thus fail to fully exploit the correlations between network patterns, which play a crucial role in predicting future interactions. This study therefore focuses on CTDG methods and presents a novel continuous-time dynamic graph learning method based on spatio-temporal random walks, which makes three main contributions: (i) by considering both temporal constraints and topological structure, our method extracts diverse, expressive patterns from CTDGs; (ii) it introduces the hitting counts of nodes at a given position as each node's relative identity, fully exploiting the correlation of network patterns while keeping the pattern structure consistent after node identities are removed; (iii) an attention mechanism aggregates the walk encodings, distinguishing the importance of different walks and thereby delineating the relationships and structural attributes between nodes more precisely, which enhances the precision and expressive power of node representations. The proposed method outperforms the strongest baseline by an average of 2.72% and 2.46% across all transductive and inductive link prediction tasks, respectively, and achieves up to an 8.7% improvement on specific datasets. It also attains the second-best overall performance on dynamic node classification tasks.







Data availability
The authors confirm that the data supporting the findings of this study are available within the article.
References
Kazemi SM, Goel R, Jain K, Kobyzev I, Sethi A, Forsyth P, Poupart P (2020) Representation learning for dynamic graphs: a survey. J Mach Learn Res 21(70):1–73
Alvarez-Rodriguez U, Battiston F, Arruda GF, Moreno Y, Perc M, Latora V (2021) Evolutionary dynamics of higher-order interactions in social networks. Nat Hum Behav 5(5):586–595
Yu L, Liu Z, Sun L, Du B, Liu C, Lv W (2023) Continuous-time user preference modelling for temporal sets prediction. IEEE Trans Knowl Data Eng 36:1475–1488
Sun Y, Jiang X, Hu Y, Duan F, Guo K, Wang B, Gao J, Yin B (2022) Dual dynamic spatial-temporal graph convolution network for traffic prediction. IEEE Trans Intell Transp Syst 23(12):23680–23693
Simmel G (1950) The sociology of Georg Simmel, vol 92892. Simon and Schuster, New York
Granovetter MS (1973) The strength of weak ties. Am J Sociol 78(6):1360–1380
Pareja A, Domeniconi G, Chen J, Ma T, Suzumura T, Kanezashi H, Kaler T, Schardl T, Leiserson C (2020) EvolveGCN: evolving graph convolutional networks for dynamic graphs. Proc AAAI Conf Artif Intell 34:5363–5370
Zhao L, Song Y, Zhang C, Liu Y, Wang P, Lin T, Deng M, Li H (2019) T-GCN: a temporal graph convolutional network for traffic prediction. IEEE Trans Intell Transp Syst 21(9):3848–3858
Seo Y, Defferrard M, Vandergheynst P, Bresson X (2018) Structured sequence modeling with graph convolutional recurrent networks. In: Neural Information Processing: 25th International Conference, ICONIP 2018, Siem Reap, Cambodia, December 13-16, 2018, Proceedings, Part I 25. Springer, pp 362–373
Wang J, Zhu W, Song G, Wang L (2022) Streaming graph neural networks with generative replay. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp 1878–1888
Goyal P, Kamra N, He X, Liu Y (2018) DynGEM: deep embedding method for dynamic graphs. arXiv preprint arXiv:1805.11273
Sankar A, Wu Y, Gou L, Zhang W, Yang H (2020) DySAT: Deep neural representation learning on dynamic graphs via self-attention networks. In: Proceedings of the 13th International Conference on Web Search and Data Mining, pp 519–527
Xu D, Ruan C, Korpeoglu E, Kumar S, Achan K (2020) Inductive representation learning on temporal graphs. In: International Conference on Learning Representations
Rossi E, Chamberlain B, Frasca F, Eynard D, Monti F, Bronstein M (2020) Temporal graph networks for deep learning on dynamic graphs. arXiv preprint arXiv:2006.10637
Wang Y, Chang YY, Liu Y, Leskovec J, Li P (2021) Inductive representation learning in temporal networks via causal anonymous walks. In: International Conference on Learning Representations (ICLR)
Souza A, Mesquita D, Kaski S, Garg V (2022) Provably expressive temporal graph networks. Adv Neural Inf Process Syst 35:32257–32269
Cong W, Zhang S, Kang J, Yuan B, Wu H, Zhou X, Tong H, Mahdavi M (2023) Do we really need complicated model architectures for temporal networks? In: The Eleventh International Conference on Learning Representations
Trivedi R, Farajtabar M, Biswal P, Zha H (2019) DyRep: Learning representations over dynamic graphs. In: International Conference on Learning Representations
Wang L, Chang X, Li S, Chu Y, Li H, Zhang W, He X, Song L, Zhou J, Yang H (2021) TCL: transformer-based dynamic graph modelling via contrastive learning. arXiv preprint arXiv:2105.07944
Kumar S, Zhang X, Leskovec J (2019) Predicting dynamic embedding trajectory in temporal interaction networks. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp 1269–1278
Wang X, Lyu D, Li M, Xia Y, Yang Q, Wang X, Wang X, Cui P, Yang Y, Sun B, et al (2021) APAN: Asynchronous propagation attention network for real-time temporal graph embedding. In: Proceedings of the 2021 International Conference on Management of Data, pp 2628–2638
Nguyen GH, Lee JB, Rossi RA, Ahmed NK, Koh E, Kim S (2018) Dynamic network embeddings: from random walks to temporal random walks. In: 2018 IEEE International Conference on Big Data (Big Data), IEEE, pp 1085–1092
Zhang M, Xu B, Wang L (2023) Dynamic network link prediction based on random walking and time aggregation. Int J Mach Learn Cybern 14(8):2867–2875
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Process Syst 30:5998–6008
Feng Z, Wang R, Wang T, Song M, Wu S, He S (2024) A comprehensive survey of dynamic graph neural networks: Models, frameworks, benchmarks, experiments and challenges. arXiv preprint arXiv:2405.00476
Yang L, Chatelain C, Adam S (2024) Dynamic graph representation learning with neural networks: a survey. IEEE Access 12:43460–43484
Trivedi R, Dai H, Wang Y, Song L (2017) Know-evolve: deep temporal reasoning for dynamic knowledge graphs. In: International Conference on Machine Learning, PMLR, pp 3462–3471
Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781
Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems 26
Perozzi B, Al-Rfou R, Skiena S (2014) DeepWalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 701–710
Grover A, Leskovec J (2016) node2vec: Scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 855–864
Liu Z, Che W, Wang S, Xu J, Yin H (2023) A large-scale data security detection method based on continuous time graph embedding framework. J Cloud Comput 12(1):89
Veličković P, Cucurull G, Casanova A, Romero A, Liò P, Bengio Y (2018) Graph attention networks. In: International Conference on Learning Representations
Wang X, Ji H, Shi C, Wang B, Ye Y, Cui P, Yu PS (2019) Heterogeneous graph attention network. In: The World Wide Web Conference, pp 2022–2032
Zhang Y, Shi Z, Feng D, Zhan X-X (2019) Degree-biased random walk for large-scale network embedding. Futur Gener Comput Syst 100:198–209
Jin M, Li Y-F, Pan S (2022) Neural temporal walks: Motif-aware representation learning on continuous-time dynamic graphs. Adv Neural Inf Process Syst 35:19874–19886
Poursafaei F, Huang S, Pelrine K, Rabbany R (2022) Towards better evaluation for dynamic link prediction. Adv Neural Inf Process Syst 35:32928–32941
Yu L, Sun L, Du B, Lv W (2023) Towards better dynamic graph learning: new architecture and unified library. Adv Neural Inf Process Syst 36:67686–67700
Acknowledgements
This research was supported by the Key Research and Development Program of Hunan Province (Grant No. 2023SK2038).
Author information
Authors and Affiliations
Contributions
These authors contributed equally to this work.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
A Notations
See Table 7.
B Theoretical analysis of the combined sampling probabilities
1.1 B.1 Theoretical background
In dynamic graph learning, both temporal and spatial information are crucial for understanding the relationships between nodes. Temporal random walks emphasize the relevance of time adjacency, while spatial random walks focus on the connectivity between nodes. To leverage both types of information, it is essential to consider their combined effects.
Temporal sampling probability \(P_t(a)\): The temporal sampling probability reflects the influence of nodes around a specific time t. Writing \(t_a\) for the timestamp of candidate node a and normalizing over the candidate neighbor set \(\mathcal{N}\), a time-biased form consistent with this design is
$$P_t(a) = \frac{\exp(t_a - t)}{\sum_{a' \in \mathcal{N}} \exp(t_{a'} - t)}.$$
The closer the timestamp \(t_a\) of node a is to the current time t, the higher its probability. This design prioritizes temporally adjacent nodes, thereby capturing temporal dynamics effectively.
Spatial sampling probability \(P_s(a)\): The spatial sampling probability considers the connectivity of nodes. With \(d_a\) denoting the degree of candidate node a, the degree-proportional form is
$$P_s(a) = \frac{d_a}{\sum_{a' \in \mathcal{N}} d_{a'}}.$$
The higher the degree \(d_a\) of node a, the greater its probability. Nodes with higher degrees typically occupy more critical positions in the network and should therefore receive higher priority during sampling.
1.2 B.2 Theoretical analysis: necessity of averaging
In many practical scenarios, temporal and spatial information are interconnected: in dynamic networks, the connectivity of a node (spatial) may influence its interaction times (temporal) and vice versa. Thus, assuming equal contributions from \(P_t(a)\) and \(P_s(a)\) effectively reflects the complex relationships present in real-world situations.
In certain situations, nodes that are close in time may not be connected spatially, and vice versa. Therefore, solely considering one factor might result in the loss of crucial information. By averaging, we can balance the effects of both, enhancing the model’s adaptability and generalization capability.
1.3 B.3 Mathematical derivation
To illustrate this rigorously, we approach it from a probability-theory perspective.
Assuming \(P_t(a)\) and \(P_s(a)\) are valid probability distributions, we have
$$\sum_{a} P_t(a) = 1, \qquad \sum_{a} P_s(a) = 1.$$
Defining \(P_{\text{combined}}\) as the average
$$P_{\text{combined}}(a) = \frac{1}{2}\bigl(P_t(a) + P_s(a)\bigr),$$
we then have
$$\sum_{a} P_{\text{combined}}(a) = \frac{1}{2}\sum_{a} P_t(a) + \frac{1}{2}\sum_{a} P_s(a) = \frac{1}{2} + \frac{1}{2} = 1.$$
This indicates that \(P_{\text{combined}}\) is also a valid probability distribution.
Combining the temporal and spatial information through averaging \(P_t (a)\) and \(P_s (a)\) is both reasonable and necessary. This approach not only considers the unique contributions of both factors but also enhances the model’s performance in dynamic graph learning. In summary, taking into account the influences of time and space under different conditions allows us to more effectively uncover the diversity and complexity within dynamic systems.
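To make the combination concrete, the following minimal NumPy sketch (function and variable names are illustrative, not from the released implementation) computes both distributions over a candidate neighbor set and samples the next node of a walk from their average:

```python
import numpy as np

def combined_sampling_probs(timestamps, degrees, t_now):
    """Average of time-biased and degree-biased sampling probabilities.

    timestamps : interaction times t_a of the candidate neighbors
    degrees    : node degrees d_a of the same candidates
    t_now      : current time t of the walk
    """
    # Temporal term: more recent interactions get higher weight.
    # Subtracting the max before exp() keeps the softmax numerically stable.
    dt = timestamps - t_now
    w_t = np.exp(dt - dt.max())
    p_t = w_t / w_t.sum()

    # Spatial term: probability proportional to node degree.
    p_s = degrees / degrees.sum()

    # Equal-weight average of two valid distributions; the result
    # still sums to 1, as shown in Appendix B.3.
    return 0.5 * (p_t + p_s)

# Example: pick the next node of a spatio-temporal walk.
rng = np.random.default_rng(0)
ts = np.array([3.0, 7.5, 9.0])    # neighbor interaction times
deg = np.array([4.0, 1.0, 2.0])   # neighbor degrees
probs = combined_sampling_probs(ts, deg, t_now=10.0)
next_idx = rng.choice(len(probs), p=probs)
```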
C Time complexity analysis
In Algorithm 1, the time complexity of the outer loop is O(l), the middle loop is O(C), and the inner loop, which traverses neighbors, has a complexity of O(d), where d is the maximum degree of the nodes. Therefore, the overall time complexity is O(lCd).
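As a reference point, the sketch below illustrates the loop nesting that yields this bound; `neighbors` and `prob` are hypothetical stand-ins for the temporal-neighbor lookup and the combined sampling probability of Algorithm 1, which is not reproduced here.

```python
import random

def sample_walks(neighbors, start, l, C, prob):
    """Schematic of the O(lCd) loop structure analyzed above."""
    walks = [[start] for _ in range(C)]
    for _ in range(l):                  # outer loop: O(l) walk steps
        for walk in walks:              # middle loop: O(C) walks per node
            cand = neighbors(walk[-1])  # inner loop scans up to d neighbors
            if cand:
                weights = [prob(a) for a in cand]  # O(d) scoring work
                walk.append(random.choices(cand, weights=weights)[0])
    return walks
```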
D Experimental setting
1.1 D.1 Dataset source
Most of the original dynamic graph datasets come from Origin Datasets, which can be downloaded here. For convenience, you can also directly download the processed data package from processed_data. The descriptions of these datasets are as follows:
-
UNTrade contains the food and agriculture trade between 181 nations for more than 30 years. The weight of each link indicates the total sum of normalized agriculture import or export values between two particular countries.
-
Wikipedia is a bipartite interaction graph that contains the edits on Wikipedia pages over a month. Nodes represent users and pages, and links denote the editing behaviors with timestamps. Each link is associated with a 172-dimensional Linguistic Inquiry and Word Count (LIWC) feature. This dataset additionally contains dynamic labels that indicate whether users are temporarily banned from editing.
-
Reddit is bipartite and records the posts of users under subreddits during one month. Users and subreddits are nodes, and links are the timestamped posting requests. Each link has a 172-dimensional LIWC feature. This dataset also includes dynamic labels representing whether users are banned from posting.
-
Enron records the email communications between employees of the ENRON energy corporation over three years.
-
UCI is an online communication network, where nodes are university students and links are messages posted by students.
-
Flights is a dynamic flight network that displays the development of air traffic during the COVID-19 pandemic. Airports are represented by nodes and the tracked flights are denoted as links. Each link is associated with a weight, indicating the number of flights between two airports in a day.
-
MOOC is a bipartite interaction network of online courses, where nodes are students and course content units (e.g., videos and problem sets). Each link denotes a student's access to a specific content unit and is assigned a 4-dimensional feature.
-
LastFM is bipartite and consists of the information about which songs were listened to by which users over one month. Users and songs are nodes, and links denote the listening behaviors of users.
1.2 D.2 Baselines
-
CTDNE [22] extends static network embedding to dynamic graphs, combining temporal random walks with the skip-gram model to learn node representations.
-
DyRep [18] introduces a recurrent architecture to update node states during each interaction. It also includes a temporal attention aggregation module to consider the structural information evolving over time in dynamic graphs.
-
JODIE [20] uses two coupled recurrent neural networks to update the states of users and items. It introduces a projection operation to learn the future representation trajectories of each user/item.
-
TGAT [13] computes node representations by aggregating features from each node’s temporal-topological neighbors through a self-attention mechanism. It also features a time encoding function to capture temporal patterns.
-
TGN [14] maintains an evolving memory for each node, updating it when nodes are observed in interactions. This is achieved through message functions, a message aggregator, and a memory updater. An embedding module generates the temporal representation of nodes.
-
CAWN [15] extracts multiple causal anonymous walks for each node, exploring the causal relationships in the network dynamics and generating relative node identities. It then encodes each walk using a recurrent neural network and aggregates these walks to obtain the final node representation.
-
EdgeBank [37] is a purely memory-based method for transductive dynamic link prediction, with no trainable parameters. It stores observed interactions in memory cells and updates the memory through various strategies.
-
GraphMixer [17] integrates a fixed time encoding function into an MLP-Mixer-based link encoder to learn temporal link relationships.
-
NeurTWs [36] learns temporal node embeddings by combining contrastive learning and random walks with neighbor graphs. The focus is on optimizing node representations by contrasting positive and negative samples.
1.3 D.3 Implementation details
Our code is available at STAW, where we provide detailed instructions for dataset preparation and model training. The searched ranges of hyperparameters and the related methods are shown in Table 8.
E Time encoding
In our model, time is modeled using a series of cosine functions with different frequencies. We do not directly apply the traditional Fourier transform; instead, we encode the timestamps through a linear transformation to indirectly capture the frequency features in the time series. This process can be viewed as a “Fourier-like” encoding of the time series, with the goal of modeling the periodic characteristics of time using sine and cosine basis functions at different frequencies.
Specifically, \(\Delta t = t' - t\), and the trainable parameter matrix \(\omega\) represents different frequency scales. Each frequency scale corresponds to a specific time period; during the forward pass, the timestamps are transformed linearly and then passed through the cosine function to generate the corresponding encoding. This process simulates the decomposition of the time series similar to a Fourier transform, but here we generate the frequency features directly from pre-defined frequency scales, initialized as \(1 / 10^{9k / \text{time\_dim}}\) for \(k = 0, 1, \ldots, \text{time\_dim} - 1\).
The main motivation for using Fourier transforms or similar approaches is to capture the periodic characteristics of the time series, especially when the data contains multiple frequency components. This method allows the model to extract and represent periodic patterns at different time scales, providing better generalization when dealing with long-term and complex sequential data.
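For illustration, the following minimal PyTorch sketch implements the encoder as described (class and parameter names are our own; the released STAW code may organize this differently):

```python
import numpy as np
import torch
import torch.nn as nn

class TimeEncoder(nn.Module):
    """Fourier-like time encoding: a linear map of the time delta
    followed by cos, with frequencies initialized as
    1 / 10**(9k / time_dim). A minimal sketch, not the released code."""

    def __init__(self, time_dim: int):
        super().__init__()
        self.w = nn.Linear(1, time_dim)
        # Frequency scales 1 / 10^(9k / time_dim) for k = 0..time_dim-1,
        # spanning roughly nine orders of magnitude.
        freqs = 1.0 / 10 ** (9.0 * np.arange(time_dim) / time_dim)
        self.w.weight = nn.Parameter(
            torch.from_numpy(freqs).float().view(time_dim, 1))
        self.w.bias = nn.Parameter(torch.zeros(time_dim))

    def forward(self, delta_t: torch.Tensor) -> torch.Tensor:
        # delta_t: [batch, seq_len] time differences t' - t
        # returns: [batch, seq_len, time_dim] periodic features
        return torch.cos(self.w(delta_t.unsqueeze(-1)))

# Example: encode time gaps for a batch of two walks of length three.
enc = TimeEncoder(time_dim=100)
dt = torch.tensor([[0.0, 1.5, 4.0], [0.0, 2.0, 8.0]])
features = enc(dt)  # shape: [2, 3, 100]
```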
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sheng, J., Zhang, Y. & Wang, B. Continuous-time dynamic graph learning based on spatio-temporal random walks. J Supercomput 81, 389 (2025). https://doi.org/10.1007/s11227-024-06881-5