Modeling multi-regional temporal correlation with gated recurrent unit and multiple linear regression for urban traffic flow prediction

doi:10.1016/j.knosys.2022.110237

Knowledge-Based Systems

Volume 262, 28 February 2023, 110237

https://doi.org/10.1016/j.knosys.2022.110237

Abstract

Urban traffic flow prediction has received much attention in the past few years, especially since large volumes of traffic data became available. The efficacy of some existing traffic flow prediction techniques depends heavily on influential external factors such as weather, geographical information, points of interest (POIs), or road information. However, exploitable data containing such auxiliary information are extremely limited. One primary reason is that some emerging cities lack the resources necessary for collecting such data. It is therefore difficult to select an effective method that can accurately predict the traffic state when data are limited. This paper proposes a framework that achieves accurate predictions based on four data characteristics and the temporal correlation of neighbors, without using additional influential factors. Specifically, we first propose a novel Multi-Region Correlation (MRC) method to make regions cooperate and share their traffic flow and patterns. Then, a deep recurrent network is used to capture the neighbors’ temporal correlation by integrating the history of the traffic that flows among regions and their neighbors. When analyzing the traffic data history, we model four data characteristics (seasonality, trend, residual, and cyclic). Lastly, we employ a Multiple Linear Regression Unit (MLRU) to predict the future traffic directly from the neighbors’ traffic. We conducted extensive experiments on two real-world datasets collected from two major cities in China, Chengdu and Xi’an. The results demonstrate that the proposed model achieves superior performance compared to many existing models in terms of accuracy and efficiency.

Introduction

Rapid urbanization has modernized many metropolitan areas and deeply affects people’s lives. This growth has also produced huge volumes of data, such as human mobility, geographic, and traffic flow data. At the same time, it has caused many issues that need proper management, such as pollution, energy consumption, and traffic congestion. Urban computing aims to provide a data-driven approach to these problems by leveraging the huge data generated in cities [1]. It has provided a wide range of applications in different areas such as transportation, environment, social, energy, and urban planning. As one of its major components, urban traffic flow prediction for Intelligent Transportation Systems (ITS) has received enormous attention in the past decades. Effective traffic flow modeling can reduce traffic congestion or air pollution and helps decision makers develop good plans for city management. Moreover, it provides early warnings for public safety emergency management. Nevertheless, such prediction is very challenging because of the dynamic and complex traffic situations in large cities [1], [2], [3]. Recently, many models have been developed for understanding and predicting traffic flow [4], [5], [6]. However, for reasons such as limited data resources, models and algorithms that depend on external information are not suitable for some situations. Many previous works on traffic prediction use weather datasets and spatial datasets, such as geographical topology, POIs, and road segment information, in addition to historical traffic to predict future traffic flow. However, only a few public datasets contain all of this information, and some newly emerging cities may not have the means to collect such data. Consequently, those models cannot be applied in such situations.

In our study, we assume that cities lack the capability or the technology to collect data from the various sources that can be used in the traffic forecasting process. Hence, we aim to develop effective and efficient urban traffic flow forecasting models based solely on traffic data, without using external information. In particular, we focus on modeling the essential forecasting characteristics in time series analysis, i.e., seasonality, trend, residual, and periodicity, as well as the strength of the temporal effect of the traffic flow among neighbors. These crucial aspects have not been thoroughly examined in most traffic prediction studies. This study demonstrates that accurate predictions can be achieved by integrating these predictors and the neighbors’ temporal correlation.
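To make the decomposition idea concrete, the following is a minimal sketch of a classical additive decomposition that splits a series into trend, seasonal, and residual parts. It is not the procedure used in this paper: the function names, the restriction to an odd `period`, and the toy series are illustrative assumptions, and the cyclic component is not separated here.

```python
def moving_average(series, window):
    """Centered moving average; positions without a full window are None."""
    half = window // 2
    out = [None] * len(series)
    for i in range(half, len(series) - half):
        out[i] = sum(series[i - half:i + half + 1]) / window
    return out


def decompose(series, period):
    """Additive split: series[t] = trend[t] + seasonal[t] + residual[t]."""
    if period % 2 == 0:
        raise ValueError("this sketch assumes an odd period")
    if len(series) < 2 * period:
        raise ValueError("need at least two full periods")
    trend = moving_average(series, period)
    # Collect detrended values by their position within the period.
    buckets = [[] for _ in range(period)]
    for t, (y, m) in enumerate(zip(series, trend)):
        if m is not None:
            buckets[t % period].append(y - m)
    means = [sum(b) / len(b) for b in buckets]
    center = sum(means) / period  # force the seasonal part to sum to zero
    seasonal = [means[t % period] - center for t in range(len(series))]
    residual = [None if m is None else y - m - s
                for y, m, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual


# Toy series (assumed data): linear growth plus a repeating 3-step pattern.
pattern = [2, -1, -1]
flows = [10 + t + pattern[t % 3] for t in range(12)]
trend, seasonal, residual = decompose(flows, period=3)
print(trend[1:4], seasonal[:3])
```

On real traffic counts the period would be chosen from the sampling rate (for example, the number of intervals per day), which is an assumption left to the reader here.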

To introduce our method at a high level, we analyze the traffic history of neighboring regions and model their four characteristics using an encoder–decoder model, and we use a Multiple Linear Regression Unit (MLRU) to aid the decoder in the prediction task. Learning proceeds as follows: a deep recurrent neural network encodes the temporal correlation between regions and their neighbors; the MLRU predicts the future traffic directly from the neighbors’ traffic and feeds its output to the decoder, which uses attention-based techniques to make regions aware of their neighbors’ influences. Our method is therefore based on regions’ correlation and deep sequence learning of the four time series characteristics. This paper’s contributions can be summarized as follows:

• We propose a model (MRC-MLRU) that predicts urban traffic flow based on the correlation of regions and uses a deep recurrent network to capture such correlations. The model uses the attention technique and can thus capture the inherent relationship among regions and their neighbors’ influence.
• We employ the MLRU to predict the future traffic directly from the neighbors’ traffic by converting the time series prediction problem into a multiple linear regression problem and combining the outputs of the attention layers and the MLRU. As a result, the model can overcome the exposure bias problem and prevent its hidden states from being updated by wrong sequence predictions when its predictions are poor in the early stages.
• To capture the influence of the temporal correlation, we model four characteristics of traffic data (seasonality, trend, residual, and cyclic), which can represent diverse temporal correlations and improve model performance.
• We evaluate our model on two real-world datasets and find that it greatly enhances urban traffic prediction and gains better traffic-related knowledge from the neighbors’ traffic and the components’ attributes.
• We release datasets for predicting traffic flow, derived from the ordered datasets of ride requests collected by the DiDi company. We have processed and reformulated them as traffic forecasting datasets. The code and datasets have been released.¹
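The idea of predicting a region's traffic directly from its neighbors' traffic with multiple linear regression can be sketched as follows. This is an illustrative ordinary least-squares fit in plain Python, not the released MLRU implementation: the helper names (`solve_linear_system`, `fit_mlr`, `predict_mlr`), the pairing of neighbors' flows at time t with the region's flow at time t+1, and the toy numbers are all hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: neighbor series are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_mlr(neighbor_rows, targets):
    """Least-squares fit of target = w0 + sum_k w_k * neighbor_k."""
    design = [[1.0] + list(row) for row in neighbor_rows]  # intercept column
    p = len(design[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict_mlr(weights, neighbor_row):
    """Apply the fitted intercept and per-neighbor coefficients."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], neighbor_row))


# Assumed toy data: flows of two neighbors at time t ...
neighbor_flows = [(10, 4), (12, 5), (9, 7), (15, 3), (11, 6)]
# ... and the target region's flow at time t + 1.
region_next = [16.2, 18.9, 21.4, 16.6, 20.3]

weights = fit_mlr(neighbor_flows, region_next)
print(weights, predict_mlr(weights, (13, 5)))
```

Because the regression reads the observed neighbor flows rather than the model's own earlier outputs, a poor early prediction is not fed back into later ones, which is the intuition behind the exposure-bias remark in the contribution list.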

This paper is organized as follows. Section 1 contains the introduction, and Section 2 presents the related work. Following that, preliminaries and problem definition are given in Section 3. Section 4 details the proposed method. The qualitative and quantitative results of different methods are presented in Section 5. The paper is concluded in Section 6. Finally, in the Appendix, we introduce the released datasets and the preprocessing steps.

Section snippets

Related work

Machine learning theories have been widely used for time series forecasting in many areas such as weather forecasting [7], financial market prediction [8], and ITS [9], [10]. As traffic prediction is an essential part of ITS, many works on prediction have been published. The following is an overview of the literature related to traffic prediction. Ma et al. [11] use the Long Short-Term Memory (LSTM) neural network [12] to predict traffic speed by capturing nonlinear and dynamic traffic

Overview

Here, we will introduce some preliminaries, notations, and problem formulation.

Methodology

To solve the traffic flow forecasting problem in urban regions, we propose a model that captures the neighbors’ temporal correlation. Our model uses a seq2seq structure that includes an encoder for learning from history and a decoder for making future predictions [14], together with a multiple linear regression unit (MLRU). However, unlike existing works, which only use the region’s traffic history to predict future traffic, we combine the region’s and its neighbors’ traffic history. This
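As a rough illustration of combining a region's history with its neighbors' history in a recurrent encoder, the sketch below runs a standard GRU cell over inputs formed by concatenating the region's flow with its neighbors' flows at each time step. None of it is taken from the paper's code: the weights are random, the names (`gru_step`, `init_params`, `encode`) are hypothetical, flows are assumed to be scaled to [0, 1], and the update h' = (1 - z) * h + z * candidate is one of the two common GRU conventions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gru_step(x, h, params):
    """One GRU update. x: input vector, h: previous hidden state."""
    def gate(name, hidden, activation):
        pre = [a + b + c for a, b, c in zip(matvec(params["W" + name], x),
                                            matvec(params["U" + name], hidden),
                                            params["b" + name])]
        return [activation(p) for p in pre]

    z = gate("z", h, sigmoid)  # update gate
    r = gate("r", h, sigmoid)  # reset gate
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    candidate = gate("h", gated_h, math.tanh)
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, candidate)]


def init_params(input_size, hidden_size, seed=0):
    """Random illustrative weights; a trained model would learn these."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                for _ in range(rows)]

    params = {}
    for name in "zrh":
        params["W" + name] = mat(hidden_size, input_size)
        params["U" + name] = mat(hidden_size, hidden_size)
        params["b" + name] = [0.0] * hidden_size
    return params


def encode(region_history, neighbor_histories, params):
    """Run the GRU over time; each input is [region flow, neighbor flows...]."""
    h = [0.0] * len(params["bz"])
    for t, own in enumerate(region_history):
        x = [own] + [hist[t] for hist in neighbor_histories]
        h = gru_step(x, h, params)
    return h


# Assumed toy data: one region with two neighbors, four time steps.
region = [0.2, 0.4, 0.6, 0.5]
neighbors = [[0.1, 0.3, 0.5, 0.4], [0.3, 0.3, 0.2, 0.6]]
params = init_params(input_size=3, hidden_size=4)
print(encode(region, neighbors, params))
```

The final hidden state summarizes the joint history; in a seq2seq setup a decoder would consume it, which this sketch does not implement.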

Experiments

This section describes the datasets, the baseline methods, the parameter settings, and the implementation details; finally, we discuss the results of our model and compare them with the baseline methods.

We use two real-world datasets from two cities in China, Chengdu and Xi’an, to test the performance of the proposed model and the baseline models, as shown in Table 2. Both datasets are derived from the trajectory data of DiDi drivers in Chengdu and Xi’an at the second

Conclusion

In this paper, we focused on a practical challenge in traffic flow prediction: prediction without external data. To achieve effective traffic flow prediction when auxiliary information is limited, we proposed a deep-learning-based approach that incorporates the advantages of the region’s traffic history and the neighbors’ traffic. In addition, we integrate the traffic

CRediT authorship contribution statement

Taha M. Rajeh: Conceptualization, Methodology, Software, Investigation, Writing – original draft. Tianrui Li: Writing – review & editing, Funding acquisition, Supervision. Chongshou Li: Writing – review & editing. Muhammad Hafeez Javed: Software. Zhpeng Luo: Writing – review & editing. Fares Alhaek: Software.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research was supported by the National Key R&D Program of China (2019YFB2101802) and the National Natural Science Foundation of China (Nos. 62176221, 62276215). Data source: DiDi Chuxing GAIA Open Datasets.

References (48)

Zhang X. et al. Prediction of taxi destinations using a novel data embedding method and ensemble learning. IEEE Trans. Intell. Transp. Syst. (2019)
Ma X. et al. Long short-term memory neural network for traffic speed prediction using remote microwave sensor data. Transp. Res. C (2015)
Yan H. et al. Short-term traffic flow prediction based on a hybrid optimization algorithm. Appl. Math. Model. (2022)
Luo C. et al. MapReduce accelerated attribute reduction based on neighborhood entropy with Apache Spark. Expert Syst. Appl. (2023)
Zheng Y. et al. Urban computing: concepts, methodologies, and applications. ACM Trans. Intell. Syst. Technol. (2014)
Zhang J. et al. Data-driven intelligent transportation systems: A survey. IEEE Trans. Intell. Transp. Syst. (2011)
Shu W. et al. A short-term traffic flow prediction model based on an improved gate recurrent unit neural network. IEEE Trans. Intell. Transp. Syst. (2021)
Wang D. et al. When will you arrive? Estimating travel time based on deep neural networks.
Wang Y. et al. PredRNN: Recurrent neural networks for predictive learning using spatiotemporal LSTMs.
Liang Y. et al. GeoMAN: Multi-level attention networks for geo-sensory time series prediction.
Chakraborty P. et al. Fine-grained photovoltaic output prediction using a Bayesian ensemble.
Qin Y. et al. A dual-stage attention-based recurrent neural network for time series prediction.
Di Vaio M. et al. Cooperative shock waves mitigation in mixed traffic flow environment. IEEE Trans. Intell. Transp. Syst. (2019)
Hochreiter S. et al. Long short-term memory. Neural Comput. (1997)
Cho K. et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. (2014)
Sutskever I. et al. Sequence to sequence learning with neural networks.
Bahdanau D. et al. Neural machine translation by jointly learning to align and translate. (2014)
Vaswani A. et al. Attention is all you need.
Veličković P. et al. Graph attention networks.
Kipf T.N. et al. Semi-supervised classification with graph convolutional networks.
He J. et al. Improving traffic prediction with tweet semantics.
Liu X. et al. Collective traffic prediction with partially observed traffic history using location-based social media.
Liao B. et al. Deep sequence learning with auxiliary information for traffic prediction.
Zhang J. et al. Deep spatio-temporal residual networks for citywide crowd flows prediction.

