Elsevier

Information Sciences

Volume 607, August 2022, Pages 869-883

Attention based spatiotemporal graph attention networks for traffic flow forecasting

https://doi.org/10.1016/j.ins.2022.05.127

Abstract

Traffic flow forecasting is a crucial task in transportation and is necessary for congestion mitigation, traffic control, and intelligent traffic management. Deep learning models can support high-accuracy traffic flow forecasting; however, current research focuses only on the ability of the model to capture dynamic spatiotemporal features, and studies on the effect of deeper network layers on spatiotemporal features, a critical factor affecting forecasting accuracy, are limited. In this paper, we propose an attention-based spatiotemporal graph attention network (ASTGAT) model that addresses the network degradation and over-smoothing problems in order to extract in-depth spatiotemporal information. Compared with other networks, ASTGAT can capture dynamic spatiotemporal correlations in data and deepen the network to improve prediction accuracy through multiple residual convolution and high-low feature concatenation. ASTGAT comprises three components that separately model the temporal relationships of the recent, daily, and weekly periods. Each component stacks multiple spatiotemporal blocks constructed using the attention mechanism, dilated gated convolution, and a graph attention network. The graph and temporal attention layers capture spatiotemporal information dynamically, and the graph attention layer alleviates the over-smoothing phenomenon so that the network can be deepened. The combined use of the attention mechanism and the dilated gated convolution layer improves prediction over medium and long temporal spans. We validated ASTGAT on two open highway datasets; the results demonstrate that ASTGAT effectively extracts in-depth spatiotemporal information and outperforms eight current baselines. Our research aims to establish a better scientific basis for intelligent traffic management that can assist in decision making.

Introduction

Intelligent transportation systems (ITS) that can intelligently predict people's travel and life requirements have gained considerable interest in academic and business fields [1]. Traffic forecasting is a popular research topic in this domain, and it helps mitigate traffic congestion, prevent traffic accidents, and manage intelligent traffic infrastructure effectively [2]. However, a major challenge in traffic forecasting is the inherent nonlinearity of traffic flow data and its complex spatiotemporal correlation, which combines temporal and spatial correlations [3]. Temporal correlation means that the current flow is influenced by traffic conditions at the previous moment or even earlier. Spatial correlation refers to an interactive, dynamic influence between locations; that is, the traffic condition on a road segment affects the flow upstream of it [3], [4]. Predicting traffic flow at a given location and time is therefore extremely difficult, and changes in traffic volume alter these correlations, which further increases the difficulty of achieving accurate predictions. Thus, traffic flow prediction based on spatiotemporal correlation has become a popular research topic in ITS studies.

Technological developments have led to an increase in the number of studies on traffic flow forecasting. Initially, researchers focused on the temporal correlation of traffic flow data using the historical average (HA) [5], autoregressive integrated moving average (ARIMA) [6], and vector autoregressive (VAR) models [7]. In these studies, traffic flow is predicted from its temporal variation pattern; however, it is difficult to improve these models substantially to obtain more accurate predictions. With the rapid development of deep learning, the long short-term memory network (LSTM) [8] and the gated recurrent unit (GRU) [9] became the mainstream prediction models for solving problems with complex assumptions and efficiencies; however, these models focused only on temporal correlations, and the results were not very satisfactory. Thereafter, spatial correlation was considered together with temporal correlation. Wu and Zhang used convolutional neural networks (CNNs) to extract the spatial features of traffic data; however, their results were not outstanding for non-Euclidean data [10]. With the development of graph neural networks (GNNs) [11], [12], [13], spatiotemporal GNNs became the mainstream method for traffic prediction [14]. As these networks continue to improve, they become better at capturing spatiotemporal correlations and are being used in several other fields (e.g., PM2.5 [15], crime case prediction [16], and bike-sharing prediction [17]). However, there is still room for improvement in the quality of the captured spatiotemporal information.

Although many existing networks consider spatiotemporal features, it remains difficult to deepen them further. Spatiotemporal graph neural networks (STGNNs) usually combine temporal feature extraction (such as a recurrent neural network (RNN) or CNN) with spatial feature extraction (such as graph convolutional networks (GCNs) [12]). Two issues need to be considered when deepening an STGNN: (1) network degradation and vanishing gradients caused by deepening the RNN or CNN, and (2) the over-smoothing problem, i.e., node features tending toward the same values as GCN layers are stacked [18]. The coupled effect of these two problems considerably limits how far STGNNs can be deepened and, in turn, their prediction capability. Therefore, an increasing number of novel frameworks have been designed to extract spatiotemporal features with fewer layers, which lets them avoid deepening the network. However, these frameworks are limited in their ability to further enhance forecasting capabilities.
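The over-smoothing effect described above can be illustrated with a small, self-contained sketch. It is not the ASTGAT model or any GCN from the cited literature: it assumes a hypothetical 4-node path graph, one scalar feature per node, and a parameter-free mean aggregator (row-normalized adjacency with self-loops). All function names are illustrative.

```python
def row_normalize(adj):
    """Add self-loops, then divide each row by its sum (a simple mean aggregator)."""
    n = len(adj)
    with_loops = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)
    ]
    return [[value / sum(row) for value in row] for row in with_loops]


def propagate(norm_adj, features):
    """One propagation step: every node averages itself and its neighbors."""
    return [sum(w * x for w, x in zip(row, features)) for row in norm_adj]


def feature_spread(features):
    """Gap between the largest and smallest node feature."""
    return max(features) - min(features)


# Hypothetical path graph 0-1-2-3 (an assumption, not a real sensor network).
ADJ = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
NORM_ADJ = row_normalize(ADJ)


def spread_after(layers, features=(0.0, 1.0, 2.0, 3.0)):
    """Feature spread after stacking `layers` propagation steps."""
    x = list(features)
    for _ in range(layers):
        x = propagate(NORM_ADJ, x)
    return feature_spread(x)


if __name__ == "__main__":
    for depth in (0, 1, 4, 16, 64):
        print(depth, spread_after(depth))
```

As the number of stacked steps grows, the spread between node features shrinks toward zero, so the nodes become indistinguishable. Real GCN layers also apply learned weights and nonlinearities, which this sketch deliberately leaves out.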

To solve these problems, we propose an attention-based spatiotemporal graph attention network (ASTGAT) that forecasts traffic flow at each location of the traffic network. The first "attention" in ASTGAT refers to the temporal attention layer, and the second refers to the graph attention layer. The network works directly on graph-structured datasets and efficiently extracts spatiotemporal information.
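For readers unfamiliar with graph attention, the following is a minimal sketch of single-head attention coefficients in the style of the graph attention network of Veličković et al., which appears in the reference list. Whether ASTGAT's graph attention layer matches this formulation exactly is not stated here, so treat it as an assumption. The toy features, the identity `weight` matrix, and the `attn_vec` values are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(weight, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weight]


def attention_coefficients(features, neighbors, weight, attn_vec):
    """Return {i: {j: alpha_ij}}, where alpha_ij is a softmax over the
    neighbors j of node i of LeakyReLU(a^T [W h_i || W h_j])."""
    projected = [matvec(weight, h) for h in features]
    alphas = {}
    for i, nbrs in neighbors.items():
        scores = {}
        for j in nbrs:
            concat = projected[i] + projected[j]  # list concatenation
            scores[j] = leaky_relu(sum(a * c for a, c in zip(attn_vec, concat)))
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {j: math.exp(s - top) for j, s in scores.items()}
        total = sum(exps.values())
        alphas[i] = {j: e / total for j, e in exps.items()}
    return alphas


def aggregate(features, neighbors, weight, attn_vec):
    """Attention-weighted sum of projected neighbor features (no output nonlinearity)."""
    projected = [matvec(weight, h) for h in features]
    alphas = attention_coefficients(features, neighbors, weight, attn_vec)
    out_dim = len(weight)
    return [
        [sum(alphas[i][j] * projected[j][k] for j in alphas[i]) for k in range(out_dim)]
        for i in range(len(features))
    ]


# Hypothetical toy inputs: 3 nodes, 2 features each, self-loops included.
FEATURES = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
NEIGHBORS = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
WEIGHT = [[1.0, 0.0], [0.0, 1.0]]
ATTN_VEC = [0.5, -0.5, 1.0, 0.0]

if __name__ == "__main__":
    print(attention_coefficients(FEATURES, NEIGHBORS, WEIGHT, ATTN_VEC))
```

Because the coefficients depend on the node features, the weight given to each neighbor changes with the input, unlike the fixed weights of a normalized adjacency matrix.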

The main contributions of this study are as follows.

  • We propose a novel framework, ASTGAT, that enriches spatiotemporal features by stacking spatiotemporal blocks at different levels, so that the network can be deepened effectively.

  • We design a novel spatiotemporal block and a multicomponent structure to cope with rapid spatiotemporal changes. The former dynamically captures spatiotemporal correlations, and the latter focuses on the temporal patterns of different periods.

  • We compare the model with baseline methods on two public datasets. The results indicate that the prediction error was reduced by up to 7 % compared with the latest baseline methods. Further, the proposed model mitigates the over-smoothing problem that commonly occurs with STGNNs: the upper limit on the number of spatiotemporal blocks that can be effectively stacked in our model exceeds that of other STGNNs.
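The multicomponent idea (recent, daily, and weekly periods) can be sketched as index arithmetic over a time series. The text above does not specify the window lengths or the sampling rate, so the values below are assumptions: 288 steps per day (5-minute sampling), a 12-step window, and a hypothetical helper `period_window`.

```python
STEPS_PER_DAY = 288  # assumption: one sample every 5 minutes
STEPS_PER_WEEK = 7 * STEPS_PER_DAY


def period_window(t0, window, period, repeats):
    """Return `repeats` index segments of length `window`.

    Segment k (k = repeats, ..., 1) starts `k * period` steps before the
    forecast start `t0`, so the oldest segment comes first.
    """
    segments = []
    for k in range(repeats, 0, -1):
        start = t0 - k * period
        if start < 0:
            raise ValueError("not enough history before t0")
        segments.append(list(range(start, start + window)))
    return segments


def component_indices(t0, window=12, recent=1, daily=1, weekly=1):
    """Index segments feeding the recent, daily, and weekly components."""
    return {
        # Recent: the steps immediately preceding the forecast window.
        "recent": period_window(t0, window, window, recent),
        # Daily: the same time of day on previous days.
        "daily": period_window(t0, window, STEPS_PER_DAY, daily),
        # Weekly: the same time of day and weekday in previous weeks.
        "weekly": period_window(t0, window, STEPS_PER_WEEK, weekly),
    }


if __name__ == "__main__":
    for name, segs in component_indices(t0=3000).items():
        print(name, [(s[0], s[-1]) for s in segs])
```

Each component would then receive the series values at its own indices, so the three components see the same forecast target through different temporal lenses.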

The remainder of this article is organized as follows: Section 2 presents related work on traffic prediction using the STGNN approach. In Section 3, the novel framework is described, and the design principle and detailed model are introduced. Section 4 details the experimental design, baseline experiments, and results. Section 5 provides discussions on the results. Finally, Section 6 summarizes our work.

Section snippets

Traffic flow forecasting

Traffic flow forecasting is one of the most challenging problems in ITS, and many academic and business studies have been conducted to address it. Traffic flow forecasting methods are divided into dynamic modeling and data mining methods. Dynamic modeling approaches use mathematical tools and physical models to simulate traffic dynamics for prediction [19]. These models require complex mathematical formulations and theoretical assumptions, and the traffic flow data

Preliminaries

In this study, $G(V, E, A)$ denotes the traffic network, where $V \in \mathbb{R}^{N}$ represents the set of vertices, i.e., the number of sensor sites in the traffic network; $E \in \mathbb{R}^{N \times N}$ represents the set of edges, i.e., the links between sensor sites in the traffic network; and $A \in \mathbb{R}^{N \times N}$ represents the adjacency matrix.
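As a concrete illustration of this notation, the sketch below builds an $N \times N$ adjacency matrix from a list of links between sensor sites. The binary, symmetric encoding and the helper name `build_adjacency` are assumptions; the source does not say whether the adjacency is binary or distance-weighted.

```python
def build_adjacency(num_nodes, edges, symmetric=True):
    """Build an N x N binary adjacency matrix from (u, v) sensor-site links."""
    adj = [[0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        if not (0 <= u < num_nodes and 0 <= v < num_nodes):
            raise ValueError(f"edge ({u}, {v}) references an unknown node")
        adj[u][v] = 1
        if symmetric:
            adj[v][u] = 1  # treat the link as undirected
    return adj


if __name__ == "__main__":
    # Hypothetical network: 4 sensor sites connected in a chain.
    for row in build_adjacency(4, [(0, 1), (1, 2), (2, 3)]):
        print(row)
```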

The graph signal matrix is $X_G \in \mathbb{R}^{N \times F}$, which represents the $F$ feature values detected at each node of the traffic network $G$ at time $T$. Suppose we have $i$ historical time intervals, and each time $T$ has a

Experiment design

Two open traffic datasets were selected for traffic flow forecasting for evaluating the performance of the proposed model.

Results

We applied the network to two datasets (PeMSD4 and PeMSD8) for validation and evaluated it using the root mean square error (RMSE) and mean absolute error (MAE). As indicated in Table 2, our model outperforms all other reference methods on average. Compared with the best baseline method, our method reduced RMSE by approximately 7.4 % and MAE by approximately 5.8 % on PeMSD4, and RMSE by approximately 1.8 % and MAE by approximately 1.5 % on PeMSD8. These results reveal that our ASTGAT model performs strongly in traffic flow forecasting.
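For reference, the two metrics and the relative reduction quoted above can be computed as in the sketch below. The function names and the toy numbers are illustrative only and are not values from the paper.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def relative_reduction(baseline_error, model_error):
    """Percentage by which `model_error` is lower than `baseline_error`."""
    return (baseline_error - model_error) / baseline_error * 100.0


if __name__ == "__main__":
    # Toy flows, not taken from PeMSD4 or PeMSD8.
    observed = [1.0, 2.0, 3.0]
    predicted = [1.0, 2.0, 5.0]
    print(mae(observed, predicted), rmse(observed, predicted))
    print(relative_reduction(20.0, 18.0))
```

RMSE penalizes large individual errors more heavily than MAE, which is why the two metrics can show different relative improvements on the same dataset.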

Conclusions

A new deepened spatiotemporal graph neural network model (ASTGAT) was proposed and used for traffic flow prediction. This model uses a graph attention layer and a temporal attention layer to capture dynamic spatiotemporal information. Double residual convolution and high-low feature concatenation were developed to solve the network degradation and over-smoothing problems that arise when deepening the network. Thus, our network can be further deepened based on dynamic spatiotemporal

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work was supported by the Beijing Natural Science Foundation (8222009), the Training Program for Talents by Xicheng, Beijing (202137), and the Pyramid Talent Training Project of the Beijing University of Civil Engineering and Architecture (JDJQ20200306).

References (47)

  • L. Zhao et al., T-GCN: A temporal graph convolutional network for traffic prediction, IEEE Trans. Intell. Transp. Syst. (2020)
  • J. Liu et al., A summary of traffic flow forecasting methods, J. Highway Transport. Res. Dev. (2004)
  • B.M. Williams et al., Modeling and forecasting vehicular traffic flow as a seasonal ARIMA process: Theoretical basis and empirical results, J. Transp. Eng. (2003)
  • E. Zivot, J. Wang, Vector autoregressive models for multivariate time series, Modeling financial time series with...
  • S. Hochreiter et al., Long short-term memory, Neural Comput. (1997)
  • J. Chung, C. Gulcehre, K. Cho, Y. Bengio, Empirical evaluation of gated recurrent neural networks on sequence modeling,...
  • J. Zhang et al., Deep spatio-temporal residual networks for citywide crowd flows prediction, in...
  • J. Bruna, W. Zaremba, A. Szlam, Y. LeCun, Spectral networks and locally connected networks on graphs, in: 2nd...
  • T.N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, in: 5th International...
  • P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, Y. Bengio, Graph Attention Networks, in: International...
  • J. Ye et al., How to build a graph-based deep learning architecture in traffic domain: A survey, IEEE Trans. Intell. Transp. Syst. (2020)
  • Q. Li et al., Deeper insights into graph convolutional networks for semi-supervised learning, in: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18)
  • M.S. Ahmed, A.R. Cook, Analysis of freeway traffic time-series data by using Box-Jenkins techniques,...