Attention based spatiotemporal graph attention networks for traffic flow forecasting
Introduction
Intelligent transportation systems (ITS), which aim to anticipate people's travel and daily-life needs, have attracted considerable interest in academia and industry [1]. Traffic forecasting is a popular research topic in this domain; it helps mitigate traffic congestion, prevent traffic accidents, and manage intelligent traffic infrastructure effectively [2]. However, a major challenge in traffic forecasting is the inherent nonlinearity of traffic flow data and its complex spatiotemporal correlations [3]. Temporal correlation means that the traffic condition at a location depends on the condition at the previous moment or even earlier, whereas spatial correlation refers to an interactive, dynamic influence between locations; that is, the traffic condition on one road segment affects the flow on neighboring segments [[3], [4]]. Predicting traffic flow at a given location and time is therefore extremely difficult, and changes in traffic volume alter these correlations, which further increases the difficulty of achieving accurate predictions. Thus, traffic flow prediction based on spatiotemporal correlation has become a popular research topic in ITS studies.
Technological developments have led to an increase in the number of studies on traffic flow forecasting. Initially, researchers focused on the temporal correlation of traffic flow data using the historical average (HA) [5], autoregressive integrated moving average (ARIMA) [6], and vector autoregressive (VAR) models [7]. These methods predict traffic flow from its temporal variation pattern alone, and it is difficult to improve them substantially to obtain more accurate predictions. With the rapid development of deep learning, the long short-term memory network (LSTM) [8] and the gated recurrent unit (GRU) [9] became the mainstream prediction models for problems that involve complex assumptions and efficiency constraints; however, these models focused only on temporal correlations, and the results were not very satisfactory. Thereafter, spatial correlation was considered together with temporal correlation. Wu and Zhang used convolutional neural networks (CNNs) to extract the spatial features of traffic data; however, their results were not outstanding for non-Euclidean data [10]. With the development of graph neural networks (GNNs) [[11], [12], [13]], spatiotemporal GNNs became the mainstream method for traffic prediction [14]. As these networks continue to improve, they become better at capturing spatiotemporal correlations and are being used in several other fields (e.g., PM2.5 forecasting [15], crime case prediction [16], and bike-sharing prediction [17]). However, there is still room for improvement in the quality of the captured spatiotemporal information.
Although many existing networks consider spatiotemporal features, it remains difficult to deepen these networks further. Spatiotemporal graph neural networks (STGNNs) usually combine temporal feature extraction (such as a recurrent neural network (RNN) or CNN) with spatial feature extraction (such as graph convolutional networks (GCNs)) [12]. Two issues need to be considered when deepening an STGNN: (1) network degradation and vanishing gradients caused by deepening the RNN and CNN components, and (2) the over-smoothing problem, i.e., node features converging to the same trend as the GCN is deepened [18]. The coupled effect of these two problems considerably limits how far STGNNs can be deepened and, in turn, their prediction capability. Therefore, an increasing number of novel frameworks have chosen to extract spatiotemporal features with fewer layers, which lets them avoid deepening the network. However, this choice limits their ability to further enhance forecasting performance.
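The over-smoothing effect mentioned above can be illustrated with a toy example. The sketch below is not the authors' model: it assumes a hypothetical 4-node path graph, scalar node features, and a parameter-free propagation step (mean aggregation over each node's neighbourhood with self-loops), and it tracks how the spread of features across nodes shrinks as propagation steps are stacked.

```python
def row_normalize(adj):
    """Add self-loops, then divide each row by its sum (mean aggregation)."""
    n = len(adj)
    with_loops = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return [[value / sum(row) for value in row] for row in with_loops]


def propagate(norm_adj, features):
    """One propagation step: each node takes the weighted mean of its neighbourhood."""
    n = len(norm_adj)
    return [
        sum(norm_adj[i][j] * features[j] for j in range(n))
        for i in range(n)
    ]


def feature_spread(features):
    """Difference between the largest and smallest node feature."""
    return max(features) - min(features)


# Hypothetical 4-node path graph: 0 - 1 - 2 - 3 (an assumption for illustration).
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
norm_adj = row_normalize(adj)

features = [0.0, 1.0, 2.0, 3.0]  # toy scalar feature per node
spreads = [feature_spread(features)]
for _ in range(20):  # 20 stacked propagation steps
    features = propagate(norm_adj, features)
    spreads.append(feature_spread(features))

print(spreads[0], spreads[-1])
```

Because every step replaces a node's value with an average over its neighbourhood, the spread can only shrink, so after enough steps the nodes become nearly indistinguishable. Real GCN layers also apply learned weights and nonlinearities, which this sketch omits.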
To solve these problems, an attention-based spatiotemporal graph attention network (ASTGAT) is proposed to forecast traffic flow at each location of the traffic network. The first “attention” in ASTGAT refers to the temporal attention layer, and the second refers to the graph attention layer. The network works directly on graph-structured datasets and efficiently extracts spatiotemporal information.
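To make the idea of a graph attention layer concrete, the following is a minimal sketch of single-head attention over a node's neighbours, following the commonly used graph attention formulation (a LeakyReLU score on concatenated node features, normalized by a softmax over the neighbourhood). It is an illustration under stated assumptions, not the ASTGAT layer itself: the learned linear transform is omitted, and the graph, the features, and the attention vector `attn_vec` are all hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU activation used to score node pairs."""
    return x if x > 0 else slope * x


def attention_weights(features, neighbors, attn_vec):
    """Softmax-normalized attention of each node over its listed neighbours."""
    weights = {}
    for i, nbrs in neighbors.items():
        # Score each pair (i, j) from the concatenation of the two feature vectors.
        scores = [
            leaky_relu(sum(a * v for a, v in zip(attn_vec, features[i] + features[j])))
            for j in nbrs
        ]
        peak = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights[i] = {j: e / total for j, e in zip(nbrs, exps)}
    return weights


def aggregate(features, weights):
    """Replace each node's features with the attention-weighted sum over neighbours."""
    dim = len(next(iter(features.values())))
    return {
        i: [sum(w * features[j][d] for j, w in w_i.items()) for d in range(dim)]
        for i, w_i in weights.items()
    }


# Hypothetical 3-node graph; each neighbour list includes the node itself.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
attn_vec = [0.5, -0.3, 0.2, 0.1]  # stands in for a learned parameter vector

weights = attention_weights(features, neighbors, attn_vec)
updated = aggregate(features, weights)
print(weights[1])
print(updated[1])
```

Unlike a fixed adjacency-based average, the weights here depend on the current node features, which is what allows an attention layer to adapt to changing traffic conditions.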
The main contributions of this study are as follows.
- A novel framework called ASTGAT is proposed, which enriches spatiotemporal features by stacking spatiotemporal blocks at different levels so that the network can be deepened effectively.
- A novel spatiotemporal block and a multicomponent structure were designed to cope with rapid spatiotemporal changes. The former dynamically captures spatiotemporal correlations, and the latter focuses on time patterns of different periods.
- The model was compared with baseline methods on two public datasets. The results indicate that the prediction error was reduced by up to 7% compared with the latest baseline methods. Further, the proposed model mitigates the over-smoothing problem that commonly occurs in STGNNs: the number of spatiotemporal blocks that can be effectively stacked in our model exceeds that of other STGNNs.
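The multicomponent idea in the second contribution, focusing on time patterns of different periods, can be sketched as follows. This excerpt does not specify the components, so the sketch assumes an arrangement that is common in the traffic forecasting literature: a recent segment, a daily-periodic segment, and a weekly-periodic segment. The 5-minute sampling interval (288 steps per day), the function names, and the toy series are all assumptions.

```python
STEPS_PER_DAY = 288              # assumes 5-minute sampling intervals
STEPS_PER_WEEK = 7 * STEPS_PER_DAY


def recent_segment(series, t0, length):
    """The `length` steps immediately before the forecast start t0."""
    return series[t0 - length:t0]


def periodic_segment(series, t0, horizon, period, repeats):
    """Windows of `horizon` steps aligned with the forecast window,
    taken from each of the previous `repeats` periods (oldest first)."""
    segment = []
    for k in range(repeats, 0, -1):
        start = t0 - k * period
        segment.extend(series[start:start + horizon])
    return segment


# Toy series in which each value equals its own time index, so the
# selected positions are easy to inspect.
series = list(range(9 * STEPS_PER_DAY))
t0 = 8 * STEPS_PER_DAY           # forecast starts at the beginning of day 8
horizon = 12                     # predict the next 12 steps (one hour)

recent = recent_segment(series, t0, horizon)
daily = periodic_segment(series, t0, horizon, STEPS_PER_DAY, repeats=2)
weekly = periodic_segment(series, t0, horizon, STEPS_PER_WEEK, repeats=1)

print(recent[0], daily[0], weekly[0])
```

Each segment would then feed its own branch of the model, with the branch outputs fused for the final forecast; how ASTGAT fuses them is not described in this excerpt.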
The remainder of this article is organized as follows: Section 2 presents related work on traffic prediction using the STGNN approach. In Section 3, the novel framework is described, and the design principle and detailed model are introduced. Section 4 details the experimental design, baseline experiments, and results. Section 5 provides discussions on the results. Finally, Section 6 summarizes our work.
Section snippets
Traffic flow forecasting
Traffic flow forecasting is one of the most challenging problems in ITS, and many academic and industrial studies have been conducted to address it. Traffic flow forecasting methods are divided into dynamic modeling methods and data mining methods. Dynamic modeling approaches use mathematical tools and physical models to simulate traffic dynamics for prediction [19]. These models require complex mathematical formulations and theoretical assumptions, and the traffic flow data
Preliminaries
In this study, G = (V, E, A) denotes the traffic network, where V represents the set of vertices, i.e., the sensor sites in the traffic network; E represents the set of edges, i.e., the links between sensor sites in the traffic network; and A represents the adjacency matrix.
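As a small illustration of this graph representation, the sketch below builds an adjacency matrix from a list of links between sensor sites and stores one graph signal as a nodes-by-features table. The network, the undirected-link assumption, and the single flow feature are hypothetical choices made only for the example.

```python
def build_adjacency(num_nodes, edges):
    """Return a symmetric 0/1 adjacency matrix for an undirected sensor graph."""
    adj = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        adj[i][j] = 1
        adj[j][i] = 1  # undirected links are an assumption of this sketch
    return adj


# Hypothetical network: 4 sensor sites and 4 links between them.
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
adjacency = build_adjacency(4, edges)

# Graph signal at one time step: one row per sensor site, one column per
# measured feature. Here a single hypothetical feature (flow) is recorded.
signal_t = [[120.0], [95.0], [143.0], [88.0]]

for row in adjacency:
    print(row)
```

A sequence of such signals, one per time interval, forms the historical input from which future flow at every node is predicted.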
The graph signal matrix is , which represents the eigenvalues detected at each node on the traffic network at the time . Suppose we have historical time intervals, and each time T has a
Experiment design
Two open traffic datasets were selected to evaluate the performance of the proposed model on traffic flow forecasting.
Results
We applied the network to two datasets (PeMSD4 and PeMSD8) and evaluated it using the root mean square error (RMSE) and mean absolute error (MAE). As indicated in Table 2, our model outperforms all other reference methods on average. Compared with the best baseline method, our method reduced RMSE by approximately 7.4% and MAE by approximately 5.8% on PeMSD4, and RMSE by approximately 1.8% and MAE by approximately 1.5% on PeMSD8. These results show that the ASTGAT model performs well for traffic flow forecasting.
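For reference, the two metrics and the relative error reduction reported above can be computed as in the following sketch. The standard definitions of MAE and RMSE are used; the sample values are toy numbers chosen for illustration and are not taken from the paper's tables.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def relative_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Toy observations and predictions (hypothetical values).
y_true = [10.0, 20.0, 30.0]
y_pred = [12.0, 18.0, 33.0]

print(mae(y_true, y_pred))
print(rmse(y_true, y_pred))

# Toy comparison: a baseline error of 40.0 versus a model error of 37.0.
print(relative_reduction(40.0, 37.0))
```

RMSE penalizes large errors more heavily than MAE, so reporting both gives a fuller picture of how a model behaves at peak traffic versus typical conditions.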
Conclusions
A new deepened spatiotemporal graph neural network model (ASTGAT) was proposed and used for traffic flow prediction. The model uses a graph attention layer and a temporal attention layer to capture dynamic spatiotemporal information. Double residual convolution and high-low feature concatenation were developed to address the network degradation and over-smoothing problems that arise when deepening the network. Thus, the network can be deepened further based on dynamic spatiotemporal
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This work was supported by the Beijing Natural Science Foundation (8222009), the Training Program for Talents by Xicheng, Beijing (202137), and the Pyramid Talent Training Project of the Beijing University of Civil Engineering and Architecture (JDJQ20200306).
References (47)
- et al. Forecasting PM2.5 using hybrid graph convolution-based model considering dynamic wind-field to offer the benefit of spatial interpretability. Environ. Pollut. (2021)
- et al. Graph deep learning model for network-based predictive hotspot mapping of sparse spatio-temporal events. Comput. Environ. (2020)
- et al. TAGCN: Station-level demand prediction for bike-sharing system via a temporal attention graph convolution network. Inf. Sci. (2021)
- et al. Analysis of subway station capacity with the use of queueing theory. Transportation Research Part C: Emerging Technologies (2014)
- et al. Dynamic prediction of traffic volume through Kalman filtering theory. Transportation Research Part B: Method. (1984)
- et al. A dynamical spatial-temporal graph neural network for traffic demand prediction. Inf. Sci. (2022)
- et al. Dynamic graph convolutional network for long-term traffic flow prediction with reinforcement learning. Inf. Sci. (2021)
- et al. A review of computer vision techniques for the analysis of urban traffic. IEEE Trans. Intell. Transp. Syst. (2011)
- et al. Data-driven intelligent transportation systems: A survey. IEEE Trans. Intell. Transp. Syst. (2011)
- et al. Spatial-temporal synchronous graph convolutional networks: a new framework for spatial-temporal network data forecasting
- T-gcn: A temporal graph convolutional network for traffic prediction. IEEE Trans. Intell. Transp. Syst.
- A summary of traffic flow forecasting methods. J. Highway Transport. Res. Dev.
- Modeling and forecasting vehicular traffic flow as a seasonal ARIMA process: Theoretical basis and empirical results. J. Transp. Eng.
- Long short-term memory. Neural Comput.
- Deep spatio-temporal residual networks for citywide crowd flows prediction, in
- How to build a graph-based deep learning architecture in traffic domain: A survey. IEEE Trans. Intell. Transp. Syst.
- Deeper insights into graph convolutional networks for semi-supervised learning, in: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18)