Elsevier

Applied Soft Computing

Volume 78, May 2019, Pages 712-721

Exploring spatial–temporal relations via deep convolutional neural networks for traffic flow prediction with incomplete data

https://doi.org/10.1016/j.asoc.2018.09.040

Highlights

  • We transform the time series analysis problem into the task of image-like analysis.

  • We can jointly explore spatio-temporal relations by the convolution operator.

  • The proposed model can tolerate incomplete data.

  • We propose a random search based on uniform design to optimize hyper-parameters for deep CNN.

Abstract

Traffic flow prediction is a fundamental component of intelligent transportation systems. Various computational methods have been applied in this field, among which machine learning based methods are believed to be promising and scalable for big data. In general, most machine learning based methods encounter three fundamental issues: feature representation of traffic patterns, learning from a single location or from the network, and data quality. To address these three issues, we present a deep architecture for traffic flow prediction that learns a deep hierarchical feature representation with spatio-temporal relations over the traffic network. Furthermore, we design an ensemble learning strategy via random subspace learning so that the model can tolerate incomplete data. The contributions of this work are threefold. First, we transform the time series analysis problem into an image-like analysis task. Benefiting from the image-like data form, we can jointly explore spatio-temporal relations with the two-dimensional convolution operator. Second, the proposed model can tolerate incomplete data, which is very common in traffic applications. Finally, we propose an improved random search based on uniform design to optimize the hyper-parameters of deep Convolutional Neural Networks (deep CNNs). A wide range of experiments under various traffic conditions has been performed on traffic data from the California Freeway Performance Measurement System (PeMS). The experimental results corroborate the effectiveness of the proposed approach compared with the state of the art.

Introduction

The task of traffic flow prediction is to estimate the average number of vehicles in a specific region during a future time interval, given the historical flow data from the entire traffic network; it is an essential component of Advanced Traffic Management Systems (ATMSs). Traffic flow prediction is crucial for intelligent traffic management and control, and it has become one of the major research fields in Intelligent Transportation Systems (ITSs) [1], [2], [3], [4]. With the rapid development of advanced traffic sensing infrastructure and information management systems, it has become possible and convenient to monitor real-time traffic conditions and to collect a large volume of historical traffic data. In the past, more attention was paid to the collection and storage management of historical data. However, accurate forecasting of future traffic flow would bring many benefits for traveler route planning and traffic management scheduling. It is therefore necessary to analyze the historical traffic data and to construct data-driven traffic condition prediction systems that serve both citizens' travel and government agencies' management. In fact, as more types of traffic data are collected, it is believed that the volume of data is large enough to reveal the intrinsic statistical information and potential relations hidden in the data [5], [6], [7].

Traffic flow prediction is a typical regression problem on time series from a traffic network, and it has been studied for nearly 40 years [8]. To address the problem, data-driven statistical techniques are employed to explore the intrinsic relations in the collected data. Generally, the statistical techniques used for traffic flow prediction can be categorized into two types: parametric and nonparametric methods. The early methods belong to the former group: they characterize the collected traffic data as a time series and estimate the future traffic conditions of a given location at a specific time slice by exploring the relations hidden in the historical time-series data with parametric models [9], [10], [11], [12]. In particular, linear models are constructed along the temporal dimension. One of the earliest classic methods is the Autoregressive Moving Average (ARMA) model [10], which is also a basic statistical method for general time-series analysis. The ARMA method was later extended to the Autoregressive Integrated Moving Average (ARIMA) model [12] by adding a differencing step, which models the temporal variation of traffic flow and uses that information to predict future traffic flow. Since traffic conditions follow strong periodic patterns due to travelers' living and working rhythms, it is natural to extend the ARIMA model to the seasonal ARIMA (SARIMA) model [9], [13], [14], [15], [16]. Many studies find that the SARIMA model performs better than other ARIMA based models, and in some cases even better than some nonparametric models [13], [14], [15]. However, it is reported that the SARIMA model mainly suffers from its requirement for a large amount of data, and such traffic data are not always easy to collect. Later, in addition to linear parametric models, many nonlinear parametric models were employed for traffic flow prediction [17].
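The differencing step that separates ARIMA from ARMA, and the seasonal differencing used by SARIMA, can be illustrated with a minimal Python sketch. The flow values, the period of 4 and the helper name `difference` are illustrative assumptions and are not taken from the paper.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical flow counts: a repeating 4-step pattern plus a trend of +1 per step.
pattern = [10, 30, 50, 20]
flow = [pattern[t % 4] + t for t in range(12)]

# Lag-1 differencing, as in the "integrated" part of ARIMA.
first_diff = difference(flow, lag=1)

# Lag-4 differencing, as in SARIMA with an assumed period of 4.
# The periodic pattern cancels and only the trend contribution remains.
seasonal_diff = difference(flow, lag=4)
print(seasonal_diff)
```

In this toy series the seasonal difference is constant, which is the kind of stationary signal that the autoregressive and moving-average terms are then fitted to.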

On the other hand, nonparametric methods with more complicated structures are believed to have a stronger ability to capture the nonlinear characteristics embedded in short-term traffic flow data. A large group of nonparametric models has been explored for traffic flow prediction, and these can be divided into two classes: probabilistic graphical models and machine learning based models. The graphical models introduced in this field include the Gaussian process model [18], the Bayesian network [19], the Markov chain [20], and Markov Random Fields (MRFs) [21]. The connection structure of graphical models has been shown to represent some nonlinear dependencies. However, this type of method suffers from the complexity of the inference problem in real-time applications. Recently, machine learning based methods have attracted more research attention, such as artificial neural networks [22], [23], [24], [21], [25], [26], [27], Support Vector Regression (SVR) [3] and Local Weight Learning (LWL) [28]. In particular, deep learning has shown significantly better performance in various fields, such as image classification, speech recognition and natural language processing, and traffic flow data are represented as time series, which resemble natural language data. For these reasons, many recent works apply deep learning models to traffic flow prediction.

In general, machine learning based nonparametric methods are driven by empirical data. To address the traffic flow prediction problem from the viewpoint of learning strategy design, we need to consider the following three issues: feature representation of traffic patterns, learning from a single location or from the network, and data quality. As in most other pattern recognition applications, the first problem is how to design the best possible representation [29], [30], [31]. Many existing methods directly use the temporal sequence as a feature vector for simplicity. Others adopt either sophisticated hand-designed features or model-based features. However, the domain knowledge of traffic engineers is limited, and they cannot directly explore the relations hidden in big datasets, so it is difficult for them to find the most appropriate hand-designed feature representation for their adopted models. Model-based features, for example deep network representations, are therefore believed to be promising.

The second issue is learning from a single location or from the network, which is regarded as one of the challenges in traffic flow prediction by [32]. Many existing methods construct the prediction models with the spatial and temporal dimensions treated separately. The ARIMA model, for example, predicts the future traffic flow of a road based only on the data observed on that road in previous time intervals; the historical data from other connected roads are neglected entirely. However, the traffic flow on a road is clearly affected by the connected traffic network [33], so it is better to predict traffic flow by exploring spatio-temporal relations over the traffic network. As pointed out by [32], joint spatio-temporal analysis is one of the most important obstacles in traffic flow prediction. As early as 1980, a three-stage iterative procedure was designed for space–time modeling [34]. A more commonly used method is to compute the spatio-temporal autocorrelation to capture joint spatio-temporal relations [35]. In [36], a non-parametric spatio-temporal kernel regression model is developed for prediction tasks on spatio-temporal series. To explore temporal relations, some researchers introduced Long Short-Term Memory (LSTM) into the field of traffic flow prediction, and a natural extension is to design an LSTM network that addresses the joint spatio-temporal problem [37]. The spatial relations considered by all these existing methods are based on the geometric locations of the traffic nodes. In some cases, the traffic dataset provides no geometric information, although spatial relations between the traffic nodes still exist. Different from the existing work, in this paper we try to capture the spatial relations without geometric information.

Finally, few existing studies consider the incomplete data problem, which is in fact very severe in practical applications [36]. In this work, we employ ensemble learning based on random subspace learning so that the deep CNN can tolerate incomplete data. This paper is motivated by the three issues above. We attempt to define a deep architecture for traffic flow prediction that learns a deep hierarchical feature representation with spatio-temporal relations over the traffic network, and we further aim to make the model tolerate incomplete data. The contributions of this work are summarized in the following three points.

  • We transform the time series analysis problem into an image-like analysis task. Benefiting from the image-like data form, we can jointly explore spatial and temporal relations with the two-dimensional convolution operator.

  • The proposed model can tolerate incomplete data, which is very common in traffic applications.

  • We propose an improved random search based on uniform design in order to optimize hyper-parameters for deep CNN.
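The first contribution can be made concrete with a minimal Python sketch of the image-like view: flows are arranged as a matrix whose rows are sensors and whose columns are time intervals, and a two-dimensional kernel then mixes neighboring rows and columns in a single operation. The row ordering, the flow values, the kernel and the helper name `conv2d_valid` are all illustrative assumptions; the paper's actual network layers are not reproduced here.

```python
def conv2d_valid(image, kernel):
    """2D cross-correlation (the 'convolution' used in CNNs), no padding, stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Hypothetical flows: rows are 3 sensors, columns are 4 consecutive time intervals.
flow_image = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [3, 4, 5, 6],
]

# A 2x2 kernel covers two sensors and two time steps at once,
# so each output value depends on spatial and temporal neighbors jointly.
kernel = [[1, 0],
          [0, 1]]

feature_map = conv2d_valid(flow_image, kernel)
print(feature_map)
```

In a trained CNN the kernel weights are learned rather than fixed, and several such layers are stacked to form the hierarchical representation described above.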

The rest of this paper is organized as follows. Section 2 gives the basic formulation and introduces the corresponding theoretical background. Section 3 explains the proposed method in detail. Section 4 presents the empirical study and reports and discusses the experimental results. The final section summarizes the work and suggests some future directions.

Section snippets

Basic formulation and theoretical background

In this section, we summarize the basic formulation for the considered problem and explain the related theoretical background briefly. The task of traffic flow prediction is to estimate the accumulated number of vehicles in a specific region during a future time interval by using the historical flow data from the global traffic network. Formally, given the historical traffic flow data $\{x_{i,t} \mid i = 1, \ldots, N;\ t = 1, \ldots, T\}$ from the traffic network, where $x_{i,t}$ denotes the accumulated flow covering the ith

The proposed method

After data preparation, we turn to model construction. In this section, we present a random subspace learning based deep CNN (RSCNN) for traffic flow prediction. As shown in Fig. 2, our proposed model consists of three parts: random subspace construction, ensemble learning on deep convolutional neural networks, and result integration.
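The three parts named above can be sketched in a few lines of Python. This is only an illustration of the general random subspace idea, under strong assumptions: the base learner is a one-parameter least-squares fit standing in for a deep CNN, missing readings are marked with `None`, integration is a plain average, and all function names and data are hypothetical rather than taken from the paper.

```python
import random

def build_subspaces(n_features, subspace_size, n_members, seed=0):
    """Part 1: draw a random subset of input features (sensors) for each member."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subspace_size))
            for _ in range(n_members)]

def fit_member(X, y, idx):
    """Part 2 (stand-in learner, not a CNN): fit w so that y ~ w * mean(x[idx])."""
    m = [sum(row[i] for i in idx) / len(idx) for row in X]
    denom = sum(v * v for v in m)
    return sum(v * t for v, t in zip(m, y)) / denom if denom else 0.0

def predict(x, subspaces, weights):
    """Part 3: average the members whose selected inputs are all observed."""
    preds = []
    for idx, w in zip(subspaces, weights):
        if all(x[i] is not None for i in idx):
            preds.append(w * sum(x[i] for i in idx) / len(idx))
    return sum(preds) / len(preds) if preds else None

# Hypothetical training data: 4 sensors with identical readings, target = 2 * reading.
X = [[v, v, v, v] for v in (1.0, 2.0, 3.0, 4.0)]
y = [2.0 * v for v in (1.0, 2.0, 3.0, 4.0)]

subspaces = build_subspaces(n_features=4, subspace_size=2, n_members=6)
weights = [fit_member(X, y, idx) for idx in subspaces]

full = predict([5.0, 5.0, 5.0, 5.0], subspaces, weights)      # all sensors observed
partial = predict([5.0, 5.0, None, 5.0], subspaces, weights)  # sensor 2 missing
print(full, partial)
```

Because each member sees only part of the input, a missing sensor disables only the members that depend on it, and the remaining members can still produce a prediction. How RSCNN itself handles missing entries may differ from this simplification.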

Data description

The data used for our experiments are taken from the California Freeway Performance Measurement System (PeMS) [49]. The PeMS dataset is a well-known public benchmark for traffic flow prediction. The original dataset is very large, covering the freeway traffic data of the nine districts in California over several years. It includes three types of traffic measurements: raw vehicle count (traffic flow), occupancy (the amount of time the loop detector is active) and vehicle speed. Here, we only

Conclusion and future work

The improvement in short-term traffic flow prediction performance obtained from deep learning has two implications. First, it demonstrates that deep learning is useful in this application field and provides a significant improvement over shallow models. Second, it suggests that we explore other types of deep neural networks, for example, Deep Belief Network (DBN), AutoEncoder, Recurrent Neural Network (RNN) [58], and Long Short Term Memory (LSTM) [59]. The DBN and AutoEncoder are

Acknowledgments

This work is supported by the Fundamental Research Funds for the Central Universities (106112017CDJXY090002, CDJXS10182216), the Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN201800120), the Subproject VI (2018YFF0214706) of National Key Research and Development Program Project (2018YFF0214700), National Natural Science Foundations of China (61672119, 61379158, 61672117), National Program on Key Basic Research Project (973 Program) (2013CB328903), the

References (63)

  • Hamilton A. et al. The evolution of urban traffic control: Changing policy and technology. Transp. Plann. Technol. (2017)
  • Wu X. et al. Data mining with big data. IEEE Trans. Knowl. Data Eng. (2014)
  • Stathopoulos A. et al. Fuzzy rule-based system approach to combining traffic count forecasts. Transp. Res. Rec. J. Transp. Res. Board (2010)
  • Ahmed M.S. et al. Analysis of freeway traffic time-series data by using Box-Jenkins techniques. Transp. Res. Rec. (1979)
  • Ghosh B. et al. Bayesian time-series model for short-term traffic flow forecasting. J. Transp. Eng. (2007)
  • Moorthy C.K. et al. Short term traffic forecasting using time series methods. Transp. Plann. Technol. (1988)
  • Thomas T. et al. Predictions of urban volumes in single time series. IEEE Trans. Intell. Transp. Syst. (2010)
  • H.R. Kirby, S.M. Watson, M.S. Dougherty, Should we use neural network or statistical models for short-term motorway...
  • B. Ghosh, B. Basu, M.O. Mahony, Time-series modelling for forecasting vehicular traffic flow in Dublin, in: Proceedings...
  • Lippi M. et al. Short-term traffic flow forecasting: an experimental comparison of time-series analysis and supervised learning. IEEE Trans. Intell. Transp. Syst. (2013)
  • T. Mai, B. Ghosh, S. Wilson, Multivariate short-term traffic flow forecasting using Bayesian vector autoregressive...
  • Williams B.M. et al. Urban traffic flow prediction: application of seasonal autoregressive integrated moving average and exponential smoothing models. Transp. Res. Rec. J. Transp. Res. Board (1998)
  • Sun S. et al. Variational inference for infinite mixtures of Gaussian processes with applications to traffic flow prediction. IEEE Trans. Intell. Transp. Syst. (2011)
  • Sun S. et al. A Bayesian network approach to traffic flow forecasting. IEEE Trans. Intell. Transp. Syst. (2006)
  • G. Yu, J. Hu, C. Zhang, L. Zhuang, J. Song, Short-term traffic flow forecasting based on Markov chain model, in: Proc....
  • Chan K.Y. et al. Neural-network-based models for short-term traffic flow forecasting using a hybrid exponential smoothing and Levenberg–Marquardt algorithm. IEEE Trans. Intell. Transp. Syst. (2012)
  • Chan K.Y. et al. Selection of significant on-road sensor data for short-term traffic flow forecasting using the Taguchi method. IEEE Trans. Ind. Inf. (2012)
  • Li C. et al. Risk-averse energy trading in multienergy microgrids: a two-stage stochastic game approach. IEEE Trans. Ind. Inf. (2017)
  • Li C. et al. Efficient computation for sparse load shifting in demand side management. IEEE Trans. Smart Grid (2017)
  • Smith B.L. et al. Traffic flow forecasting: Comparison of modeling approaches. J. Transp. Eng. (1997)
  • Chan K.Y. et al. Prediction of short-term traffic variables using intelligent swarm-based neural networks. IEEE Trans. Control Syst. Technol. (2017)