Exploring spatial–temporal relations via deep convolutional neural networks for traffic flow prediction with incomplete data
Introduction
The task of traffic flow prediction is to estimate the average number of vehicles in a specific region during a future time interval, given historical flow data from the whole traffic network. It is an essential component of Advanced Traffic Management Systems (ATMSs) and a crucial task for intelligent traffic management and control, and it has become one of the major research fields in Intelligent Transportation Systems (ITSs) [1], [2], [3], [4]. With the rapid development of advanced traffic sensing infrastructure and information management systems, it has become possible and convenient to monitor real-time traffic conditions and to collect large volumes of historical traffic data. In the past, more attention was paid to the collection and storage of historical data. However, accurate forecasts of future traffic flow would bring many benefits for travelers’ route planning and for traffic management scheduling. It is therefore necessary to analyze the historical traffic data and to construct data-driven traffic condition prediction systems that serve both citizens’ travel and government agencies’ management. In fact, as more types of traffic data are collected, it is believed that the volume of data is now large enough to reveal the intrinsic statistical information and potential relations hidden in the data [5], [6], [7].
Traffic flow prediction is a typical regression problem on time series from a traffic network, and it has been studied for nearly 40 years [8]. To address the problem, data-driven statistical techniques are employed to explore the intrinsic relations in the collected data. Generally, these techniques can be categorized into two types: parametric and nonparametric methods. The early methods belong to the former group. They characterize the collected traffic data as a time series and estimate the future traffic condition at a given location and time slice by exploring the relations hidden in the historical time-series data with parametric models [9], [10], [11], [12]. In particular, linear models are constructed along the temporal dimension. One of the earliest classic methods is the Autoregressive Moving Average (ARMA) model [10], which is also a basic statistical method for general time-series analysis. The ARMA method was later extended to the Autoregressive Integrated Moving Average (ARIMA) model [12] by adding a differencing step, which captures the temporal variation of traffic flow and uses it to predict future flow. Since traffic conditions follow strong periodic patterns driven by travelers’ living and working rhythms, it is natural to extend the ARIMA model to the seasonal ARIMA (SARIMA) model [9], [13], [14], [15], [16]. Many studies find that the SARIMA model performs better than other ARIMA-based models, and in some cases even better than some nonparametric models [13], [14], [15]. However, it is reported that the SARIMA model mainly suffers from its need for a large amount of data, and in practice such data are not always easy to collect. Later, in addition to linear parametric models, many nonlinear parametric models were employed for traffic flow prediction [17].
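To make the differencing step concrete, the following is a minimal sketch, not the models from [10] or [12]. It takes first differences of a short flow series, fits a one-lag autoregressive coefficient to the differences by least squares, and adds the predicted change back to the last observation. The function names and the toy series are assumptions made for illustration; a full ARIMA model would also include moving-average terms and order selection.

```python
def difference(series):
    """First-order differencing: d[t] = x[t+1] - x[t]."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(values):
    """Least-squares estimate of phi in v[t] = phi * v[t-1] (no intercept)."""
    num = sum(prev * cur for prev, cur in zip(values, values[1:]))
    den = sum(prev * prev for prev in values[:-1])
    return num / den if den else 0.0


def forecast_next(series):
    """One-step forecast: last value plus the AR(1)-predicted next difference."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    return series[-1] + phi * diffs[-1]


# Hypothetical vehicle counts per interval (illustrative values only).
flow = [100, 110, 120, 130, 140]
print(forecast_next(flow))
```

Because the toy series grows by a constant amount, the differenced series is flat and the forecast simply continues the trend. A seasonal variant would difference against the value one period earlier rather than the previous interval.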
On the other hand, nonparametric methods with more complicated structures are believed to have a stronger ability to capture the nonlinear characteristics embedded in short-term traffic flow data. A large group of nonparametric models has been explored for traffic flow prediction. They can be divided into two classes: probabilistic graphical models and machine learning based models. The graphical models introduced in this field include the Gaussian process model [18], Bayesian networks [19], Markov chains [20], and Markov Random Fields (MRFs) [21]. The connection structure of graphical models has been shown to represent some nonlinear dependencies. However, this type of method suffers from the complexity of inference, which hinders real-time application. Recently, machine learning based methods have attracted more research attention, such as artificial neural networks [22], [23], [24], [21], [25], [26], [27], Support Vector Regression (SVR) [3] and Local Weight Learning (LWL) [28]. In particular, deep learning has shown significantly better performance in various fields, for example image classification, speech recognition and natural language processing, and traffic flow data are represented as time series, which resemble natural language data. Many recent works therefore apply deep learning models to traffic flow prediction.
In general, machine learning based nonparametric methods are driven by empirical data. From the viewpoint of learning strategy design, addressing the traffic flow prediction problem requires considering three issues: the feature representation of traffic patterns, learning from a single location or from the network, and data quality. As in most other pattern recognition applications, the first problem is how to design the best possible representation [29], [30], [31]. Many existing methods directly use the temporal sequence as a feature vector for simplicity. Others adopt either sophisticated hand-designed features or model-based features. However, the domain knowledge of traffic engineers is limited, and they cannot directly explore the relations hidden in big datasets, so it is difficult for them to find the most appropriate hand-designed feature representation for their adopted models. Model-based features, for example deep network representations, are therefore believed to be promising.
The second issue is learning from a single location or from the network, which is regarded as one of the challenges in traffic flow prediction by [32]. Many existing methods construct prediction models that treat the spatial and temporal dimensions separately. The ARIMA model, for example, predicts the future traffic flow of a road based only on the data observed on that road in previous time intervals. The historical data from other connected roads are neglected entirely. However, traffic flow is clearly affected by the connected traffic network [33], so it is better to predict traffic flow by exploring spatio-temporal relations over the traffic network. As pointed out by [32], joint spatio-temporal analysis is one of the most important obstacles in traffic flow prediction. As early as 1980, a three-stage iterative procedure was designed for space–time modeling [34]. A more commonly used approach is to compute the spatio-temporal autocorrelation to capture joint spatio-temporal relations [35]. In [36], a nonparametric spatio-temporal kernel regression model is developed for prediction on spatio-temporal series. To explore temporal relations, some researchers introduced Long Short-Term Memory (LSTM) into traffic flow prediction, and a natural extension is to design an LSTM network that addresses the joint spatio-temporal problem [37]. The spatial relations considered by all these existing methods are based on the geometric locations of the traffic nodes. In some cases, the traffic dataset provides no geometric information, yet spatial relations between the traffic nodes still exist. Unlike existing work, this paper tries to capture the spatial relations without geometric information.
Finally, few existing studies consider the incomplete data problem, which is in fact very severe in practical applications [36]. In this work, we employ ensemble learning based on random subspace learning so that the deep CNN can tolerate incomplete data. This paper is motivated by the three issues above. We attempt to define a deep architecture for traffic flow prediction that learns deep hierarchical feature representations with spatio-temporal relations over the traffic network. Furthermore, we aim to make the model tolerant of incomplete data. The contributions of this work are summarized in the following three points.
- We transform the time series analysis problem into an image-like analysis task. Thanks to the image-like data form, we can explore spatial and temporal relations jointly with the two-dimensional convolution operator.
- The proposed model can tolerate incomplete data, which are very common in traffic applications.
- We propose an improved random search based on uniform design to optimize the hyper-parameters of the deep CNN.
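The image-like view in the first contribution can be illustrated with a small sketch. It assumes that flows are arranged as a matrix with one row per sensor and one column per time interval, and it applies a single "valid" two-dimensional convolution, so each output value mixes neighbouring rows and neighbouring time steps at once. The row ordering, the toy values and the averaging kernel are assumptions for illustration, not the paper's configuration; in a CNN the kernel weights would be learned.

```python
def conv2d_valid(image, kernel):
    """Plain 'valid' 2D cross-correlation, the operation used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(kh)
                for b in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Rows: sensors, columns: consecutive time intervals (toy values).
flow_image = [
    [10, 12, 14, 16],
    [20, 22, 24, 26],
    [30, 32, 34, 36],
]
# Hypothetical 2x2 averaging kernel spanning two sensors and two time steps.
kernel = [[0.25, 0.25], [0.25, 0.25]]

feature_map = conv2d_valid(flow_image, kernel)
print(feature_map)
```

Each entry of `feature_map` depends on two sensors and two intervals, which is the sense in which one operator covers both dimensions.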
The rest of this paper is organized as follows. Section 2 gives the basic formulation and introduces the corresponding theoretical background. Section 3 explains the proposed method in detail. Section 4 presents the empirical study and reports and discusses the experimental results. The final section summarizes the work and suggests some future directions.
Section snippets
Basic formulation and theoretical background
In this section, we summarize the basic formulation for the considered problem and explain the related theoretical background briefly. The task of traffic flow prediction is to estimate the accumulated number of vehicles in a specific region during a future time interval by using the historical flow data from the global traffic network. Formally, given the historical traffic flow data from the traffic network, where denotes the accumulated flow covering the th
The proposed method
After data preparation, we turn to model construction. In this section, we present a random subspace learning based deep CNN (RSCNN) for traffic flow prediction. As shown in Fig. 2, the proposed model consists of three parts: random subspace construction, ensemble learning on deep convolutional neural networks, and result integration.
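The three parts can be sketched in a few lines, under strong simplifying assumptions. In the sketch below a one-parameter least-squares model stands in for each deep CNN, a missing input is encoded as `None`, and a member is simply skipped at prediction time when its subspace contains a missing value. All function names, the skipping rule and the toy data are hypothetical and are not taken from the paper.

```python
import random


def build_subspaces(n_features, n_members, subspace_size, seed=0):
    """Part 1: draw a random feature subset for each ensemble member."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subspace_size))
            for _ in range(n_members)]


def fit_member(X, y, subspace):
    """Part 2 (stand-in for a deep CNN): fit y ~ w * mean(selected features)."""
    means = [sum(row[k] for k in subspace) / len(subspace) for row in X]
    den = sum(m * m for m in means)
    return sum(m * t for m, t in zip(means, y)) / den if den else 0.0


def predict(x, subspaces, weights):
    """Part 3: average the members whose inputs are fully observed."""
    votes = []
    for subspace, w in zip(subspaces, weights):
        if any(x[k] is None for k in subspace):
            continue  # assumption: skip members that touch a missing value
        votes.append(w * sum(x[k] for k in subspace) / len(subspace))
    return sum(votes) / len(votes) if votes else None


# Toy data: four input features per sample, one target flow value.
X = [[10, 11, 9, 10], [20, 19, 21, 20], [30, 31, 29, 30]]
y = [10, 20, 30]
subspaces = build_subspaces(n_features=4, n_members=5, subspace_size=2)
weights = [fit_member(X, y, s) for s in subspaces]
print(predict([15, None, 15, 15], subspaces, weights))
```

Because each member sees only part of the input, a missing feature disables some members rather than the whole model; `predict` returns `None` only when every member is affected.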
Data description
The data used in our experiments are taken from the California Freeway Performance Measurement System (PeMS) [49]. The PeMS dataset is a well-known public benchmark for traffic flow prediction. The original dataset is very large and covers freeway traffic data from the nine districts of California over several years. It includes three types of traffic conditions: raw vehicle count (traffic flow), occupancy (amount of time the loop detector is active) and vehicle speed. Here, we only
Conclusion and future work
The improvement in short-term traffic flow prediction performance brought by deep learning has two implications. First, it demonstrates that deep learning is useful in this application field and provides a significant improvement over shallow models. Second, it suggests exploring other types of deep neural networks, for example the Deep Belief Network (DBN), AutoEncoder, Recurrent Neural Network (RNN) [58], and Long Short-Term Memory (LSTM) [59]. The DBN and AutoEncoder are
Acknowledgments
This work is supported by the Fundamental Research Funds for the Central Universities (106112017CDJXY090002, CDJXS10182216), the Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN201800120), the Subproject VI (2018YFF0214706) of National Key Research and Development Program Project (2018YFF0214700), National Natural Science Foundations of China (61672119, 61379158, 61672117), National Program on Key Basic Research Project (973 Program) (2013CB328903), the
References (63)
- et al. Online-SVR for short-term traffic flow prediction under typical and atypical traffic conditions. Expert Syst. Appl. (2009)
- et al. Swarm intelligence algorithms for macroscopic traffic flow model validation with automatic assignment of fundamental diagrams. Appl. Soft Comput. (2016)
- Deep learning in neural networks: An overview. Neural Netw. (2015)
- et al. Combining Kohonen maps with ARIMA time series models to forecast traffic flow. Transp. Res. Part C: Emerg. Technol. (1996)
- et al. What is the best way for extracting meaningful attributes from pictures? Pattern Recognit. (2017)
- et al. Short-term traffic forecasting: where we are and where we’re going. Transp. Res. Part C: Emerg. Technol. (2014)
- et al. LSTM network: a deep learning approach for short-term traffic forecast. IET Intell. Transp. Syst. (2017)
- et al. Model selection for support vector machines via uniform design. Comput. Statist. Data Anal. (2007)
- et al. Comparison of parametric and nonparametric models for traffic flow forecasting. Transp. Res. Part C: Emerg. Technol. (2002)
- et al. Data-driven intelligent transportation systems: A survey. IEEE Trans. Intell. Transp. Syst. (2011)
- The evolution of urban traffic control: Changing policy and technology. Transp. Plann. Technol.
- Data mining with big data. IEEE Trans. Knowl. Data Eng.
- Fuzzy rule-based system approach to combining traffic count forecasts. Transp. Res. Rec. J. Transp. Res. Board
- Analysis of freeway traffic time-series data by using Box-Jenkins techniques. Transp. Res. Rec.
- Bayesian time-series model for short-term traffic flow forecasting. J. Transp. Eng.
- Short term traffic forecasting using time series methods. Transp. Plann. Technol.
- Predictions of urban volumes in single time series. IEEE Trans. Intell. Transp. Syst.
- Short-term traffic flow forecasting: an experimental comparison of time-series analysis and supervised learning. IEEE Trans. Intell. Transp. Syst.
- Urban traffic flow prediction: application of seasonal autoregressive integrated moving average and exponential smoothing models. Transp. Res. Rec. J. Transp. Res. Board
- Variational inference for infinite mixtures of Gaussian processes with applications to traffic flow prediction. IEEE Trans. Intell. Transp. Syst.
- A Bayesian network approach to traffic flow forecasting. IEEE Trans. Intell. Transp. Syst.
- Neural-network-based models for short-term traffic flow forecasting using a hybrid exponential smoothing and Levenberg–Marquardt algorithm. IEEE Trans. Intell. Transp. Syst.
- Selection of significant on-road sensor data for short-term traffic flow forecasting using the Taguchi method. IEEE Trans. Ind. Inform.
- Risk-averse energy trading in multienergy microgrids: a two-stage stochastic game approach. IEEE Trans. Ind. Inform.
- Efficient computation for sparse load shifting in demand side management. IEEE Trans. Smart Grid
- Traffic flow forecasting: Comparison of modeling approaches. J. Transp. Eng.
- Prediction of short-term traffic variables using intelligent swarm-based neural networks. IEEE Trans. Control Syst. Technol.