Elsevier

Neurocomputing

Volume 396, 5 July 2020, Pages 438-450
A deep learning based multitask model for network-wide traffic speed prediction

https://doi.org/10.1016/j.neucom.2018.10.097

Abstract

This paper proposes a deep learning based multitask learning (MTL) model to predict network-wide traffic speed, and introduces two methods to improve the prediction performance. Nonlinear Granger causality analysis is used to detect the spatiotemporal causal relationships among links so as to select the most informative features for the MTL model. Bayesian optimization is employed to tune the hyperparameters of the MTL model at limited computational cost. Numerical experiments are carried out with taxi GPS data from an urban road network in Changsha, China, and the following conclusions are drawn. The deep learning based MTL model outperforms four deep learning based single task learning (STL) models (i.e., Gated Recurrent Units network, Long Short-term Memory network, Convolutional Gated Recurrent Units network and Temporal Convolutional Network) and three other classic models (i.e., Support Vector Machine, k-Nearest Neighbors and Evolving Fuzzy Neural Network). The nonlinear Granger causality test provides a reliable guide for selecting informative features from network-wide links for the MTL model. Compared with two other optimization approaches (i.e., grid search and random search), Bayesian optimization tunes the MTL model to a better prediction accuracy under the budgeted computation cost. In summary, the deep learning based MTL model with nonlinear Granger causality analysis and Bayesian optimization promises accurate and efficient traffic speed prediction for a large-scale network.

Introduction

Accurate short-term traffic prediction plays a critical role in Intelligent Transportation Systems (ITS) [1], [2]. However, existing short-term traffic prediction methods are mostly applied at the freeway, arterial or corridor level, and seldom at the network-wide level. Network-wide traffic prediction remains a challenging task because sufficient traffic data are difficult to collect and the interactions in densely populated urban road networks are complex to model [1]. Therefore, the main motivation of this study is to predict network-wide short-term traffic speed, so as to better understand the traffic states of the entire road network rather than of a single road link. With taxi GPS data from Changsha, China, this study develops approaches to model the complex interactions in a traffic network and then provides an accurate traffic prediction at the network level.

As a branch of machine learning, deep learning has drawn great interest in various academic and industrial fields in recent years. Because it learns multiple levels of representation, deep learning is capable of modeling complex interactions from vast amounts of data, and has been applied with success in video classification [3], natural language processing [4], [5], image recognition [6], [7], [8], drug discovery [9], etc. Encouraged by these successful applications, deep learning is increasingly employed in traffic prediction and achieves attractive performance [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22]. This study attempts to build a deep learning based multitask learning (MTL) model to mine the sophisticated inter-correlations hidden among various links, and to predict the traffic speeds of multiple road links simultaneously.

However, a network-wide MTL prediction model may include some irrelevant traffic features, which may degrade the prediction accuracy and increase the computational complexity. Hence, nonlinear Granger causality analysis [23], [24] is explored to select the most informative features for the MTL model. To the best of our knowledge, this is the first time that nonlinear Granger causality analysis has been introduced into the field of traffic prediction. Besides, because tuning the MTL model is difficult and computationally expensive, Bayesian optimization is introduced to optimize its hyperparameters at an acceptable computational cost.
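The feature-selection idea above can be illustrated with a small sketch. The nonlinear Granger causality test used in the paper is not detailed in this excerpt, so the code below substitutes the simpler linear Granger F-test as a stand-in: it checks whether lagged values of one link's speed reduce the prediction error of another link's speed beyond that link's own lags. All names (`granger_f`, `upstream`, `downstream`) and the synthetic series are hypothetical and are not taken from the paper.

```python
import random

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on the columns of X."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(k)]
    beta = solve_linear(XtX, Xty)
    return sum((t - sum(b * v for b, v in zip(beta, row))) ** 2 for row, t in zip(X, y))

def granger_f(cause, effect, lags=1):
    """F statistic: do lags of `cause` help predict `effect` beyond its own lags?"""
    rows = range(lags, len(effect))
    y = [effect[t] for t in rows]
    # Restricted model: intercept plus the effect's own lags.
    own = [[1.0] + [effect[t - k] for k in range(1, lags + 1)] for t in rows]
    # Unrestricted model: additionally includes lags of the candidate cause.
    full = [r + [cause[t - k] for k in range(1, lags + 1)] for r, t in zip(own, rows)]
    rss_r, rss_u = rss(own, y), rss(full, y)
    dof = len(y) - len(full[0])
    return ((rss_r - rss_u) / lags) / (rss_u / dof)

# Synthetic example: the downstream series is driven by the lagged upstream series.
rng = random.Random(0)
upstream = [rng.gauss(0, 1) for _ in range(300)]
downstream = [0.0]
for t in range(1, 300):
    downstream.append(0.5 * downstream[-1] + 0.8 * upstream[t - 1] + rng.gauss(0, 0.1))

f_forward = granger_f(upstream, downstream)  # upstream -> downstream
f_reverse = granger_f(downstream, upstream)  # downstream -> upstream
```

In a feature-selection setting, a link whose F statistic toward the target link is large would be kept as an input feature, while links with small statistics would be dropped. A nonlinear test replaces the linear regressions with a criterion that can also detect nonlinear dependence.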

To summarize, this paper has three objectives: to construct a deep learning based MTL model that predicts network-wide short-term traffic speed; to select the most informative traffic features for the MTL model by nonlinear Granger causality tests; and to employ Bayesian optimization to tune the MTL model so as to improve the prediction performance. The remainder of this paper is organized as follows. Section 2 reviews previous studies on short-term traffic prediction. Section 3 details the deep learning based MTL model, the nonlinear Granger causality approach and the Bayesian optimization algorithm. Numerical experiments are conducted in Section 4. Section 5 draws the conclusions.

Literature review

Over the past few decades, although most short-term traffic prediction approaches have focused on the freeway, arterial or corridor level, some efforts have also been made to carry out network-wide traffic prediction. Kamarianakis et al. employed parametric time series approaches to predict the short-term traffic speed in an urban road network [26]. Cheng et al. modeled the spatiotemporal autocorrelation structure of road networks by space-time autocorrelation analysis during travel time

Methodology

This section presents a framework for network-wide traffic prediction, which covers three components. After the deep learning based MTL model with Gated Recurrent Units (GRU) [46] is introduced, the architecture of the network-wide traffic prediction model is built. Then, nonlinear Granger causality analysis is explored to detect the spatiotemporal causal relationships among various links. Finally, Bayesian optimization is investigated to tune the proposed MTL models under limited computational
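To make the multitask idea concrete, the following is a minimal sketch of a GRU-based MTL forward pass, assuming the common hard-parameter-sharing layout: one GRU encoder shared by all links, followed by one linear output head per link. The paper's actual architecture is not given in this excerpt, so the class names, layer sizes and random (untrained) weights are illustrative assumptions only. Bias terms are omitted for brevity, and the state update uses the convention h_t = (1 - z) * h_{t-1} + z * candidate.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rng, rows, cols, scale=0.3):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

class SharedGRU:
    """One GRU cell shared by all tasks (links); weights are random, not trained."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One (W, U) pair per gate: update z, reset r, candidate h.
        self.W = {g: rand_matrix(rng, n_hidden, n_in) for g in "zrh"}
        self.U = {g: rand_matrix(rng, n_hidden, n_hidden) for g in "zrh"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(matvec(self.W["h"], x), matvec(self.U["h"], rh))]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def encode(self, sequence):
        """Run the cell over a sequence of input vectors and return the last state."""
        h = [0.0] * self.n_hidden
        for x in sequence:
            h = self.step(x, h)
        return h

class MultiTaskSpeedModel:
    """Shared GRU encoder plus one linear output head per road link."""

    def __init__(self, n_links, n_hidden, seed=0):
        rng = random.Random(seed)
        self.encoder = SharedGRU(n_links, n_hidden, rng)
        self.heads = rand_matrix(rng, n_links, n_hidden)  # row i is the head for link i

    def predict(self, speed_history):
        h = self.encoder.encode(speed_history)
        return matvec(self.heads, h)

# Three links, three past time steps of (normalized) speeds per link.
model = MultiTaskSpeedModel(n_links=3, n_hidden=4)
history = [[0.5, 0.6, 0.4], [0.55, 0.58, 0.42], [0.6, 0.5, 0.45]]
next_speeds = model.predict(history)  # one prediction per link
```

Because the encoder is shared, a gradient step for any one link's head would also update the representation used by all other links, which is how multitask learning lets related links inform each other. Training is left out of this sketch.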

Study site selection and data description

An urban road network in the central business district (CBD) of Changsha, China is selected as the study scenario (cf. Fig. 5), which covers 9 signalized intersections and 26 links. The long yellow numbers labeling the links in Fig. 5 follow the traffic information management system of the traffic police detachment in Changsha, China. GPS data of taxis on these links are sampled every 2 min to estimate traffic information. The data attributes cover link ID, timestamp, the sample
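As an illustration of the 2-min sampling described above, the sketch below averages taxi GPS speed samples into 2-minute bins per link. The record layout `(link_id, timestamp_s, speed)`, the link names and the speed values are hypothetical; the paper's actual attribute list is truncated in this excerpt, and its estimation procedure may differ.

```python
from collections import defaultdict

BIN_SECONDS = 120  # 2-minute sampling interval stated in the text

def mean_speed_per_bin(records):
    """Average speed per (link_id, 2-minute bin index).

    `records` is an iterable of (link_id, timestamp_s, speed) tuples, where
    timestamp_s is in seconds; this layout is assumed for illustration.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of speeds, sample count]
    for link_id, timestamp_s, speed in records:
        key = (link_id, timestamp_s // BIN_SECONDS)
        totals[key][0] += speed
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

records = [
    ("link_a", 0, 30.0),
    ("link_a", 60, 34.0),   # same 2-minute bin as the sample above
    ("link_a", 130, 20.0),  # falls into the next bin
    ("link_b", 10, 42.0),
]
speeds = mean_speed_per_bin(records)
```

Each key of `speeds` identifies one link and one 2-minute interval, which gives one speed series per link that a prediction model could consume.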

Conclusions

Deep learning approaches have become increasingly popular in traffic prediction. However, the majority of existing deep learning based traffic prediction studies considered only the temporal pattern of traffic evolution at a single location and did not consider its spatial correlations in the network. To fill this gap, this study presents a deep learning based MTL model to predict the network-wide short-term traffic speed. The proposed MTL model can learn spatiotemporal characteristics in an

Conflict of interest

None.

Acknowledgments

This research has been supported by the National Natural Science Foundation of China (Grants no. 71871227, 71501191, 51475152).

Kunpeng Zhang received his M. S. degree in 2014 from the College of Mechanical Engineering, Zhengzhou University, China. He is currently working toward his Ph.D. degree in the College of Mechanical and Vehicle Engineering, Hunan University, China. His research interests focus on intelligent transportation systems and urban traffic control and management.

References (69)

  • S. Yang, On feature selection for traffic congestion prediction, Transp. Res. C Emerg. Technol. (2013)
  • N. Dong et al., Support vector machine in crash prediction at the level of traffic analysis zones: assessing the spatial proximity effects, Accid. Anal. Prev. (2015)
  • L. Li et al., Robust causal dependence mining in big data network and its application to traffic flow predictions, Transp. Res. C Emerg. Technol. (2015)
  • J. Schmidhuber, Deep learning in neural networks: an overview, Neural Netw. (2015)
  • D. Kwiatkowski et al., Testing the null hypothesis of stationarity against the alternative of a unit root, J. Econ. (1992)
  • F.Y. Wang, Parallel control and management for intelligent transportation systems: concepts, architectures, and applications, IEEE Trans. Intell. Transp. Syst. (2010)
  • Z. Wu et al., Modeling spatial-temporal clues in a hybrid deep learning framework for video classification
  • R. Collobert et al., A unified architecture for natural language processing: deep neural networks with multitask learning
  • A. Kumar et al., Ask me anything: dynamic memory networks for natural language processing
  • X. Lu et al., Semi-supervised multitask learning for scene recognition, IEEE Trans. Cybern. (2015)
  • X. Lu et al., Remote sensing scene classification by unsupervised representation learning, IEEE Trans. Geosci. Remote Sens. (2017)
  • X. Lu et al., Exploring models and data for remote sensing image caption generation, IEEE Trans. Geosci. Remote Sens. (2017)
  • E. Gawehn et al., Deep learning in drug discovery, Mol. Inform. (2016)
  • Y. Lv et al., Traffic flow prediction with big data: a deep learning approach, IEEE Trans. Intell. Transp. Syst. (2015)
  • W. Huang et al., Deep architecture for traffic flow prediction: deep belief networks with multitask learning, IEEE Trans. Intell. Transp. Syst. (2014)
  • A. Koesdwiady et al., Improving traffic flow prediction with weather information in connected cars: a deep learning approach, IEEE Trans. Veh. Technol. (2016)
  • H.F. Yang et al., Optimized structure of the traffic flow forecasting model with a deep learning approach, IEEE Trans. Neural Networks Learn. Syst. (2017)
  • Y. Wu, H. Tan, Short-term traffic flow forecasting with spatial-temporal correlation in a hybrid deep learning...
  • Q. Cheng et al., Analysis and forecasting of the day-to-day travel demand variations for large-scale transportation networks: a deep learning approach, Tech. Rep. (2016)
  • J. Zhang et al., Deep spatio-temporal residual networks for citywide crowd flows prediction
  • X. Ma et al., Learning traffic as images: a deep convolutional neural network for large-scale transportation network speed prediction, Sensors (2017)
  • E. Baek et al., A General Test for Nonlinear Granger Causality: Bivariate Model (1992)
  • C. Hiemstra et al., Testing for linear and nonlinear Granger causality in the stock price-volume relation, J. Financ. (1994)
  • C.W.J. Granger, Investigating causal relations by econometric models and cross-spectral methods, Econometrica (1969)
    Liang Zheng received the B.S. degree from Central South University, Changsha, China, in 2008, and the M.E. and Ph.D. degrees from Tianjin University, Tianjin, China, in 2010 and 2013, respectively. From 2011 to 2012, he also spent one year as a Joint Doctoral Student with the University of Wisconsin, Madison, WI, USA. Since July 2013, he has been with the School of Traffic and Transportation Engineering, Central South University, as an Associate Professor. His research interests cover macroscopic and microscopic traffic flow modeling and simulation, and data-driven short-term traffic prediction.

    Zijian Liu received his M. S. degree in 1984 and Ph.D. degree in 2001 from the College of Mechanical and Vehicle Engineering, Hunan University, China. Now he is a professor in the College of Mechanical and Vehicle Engineering in the above university. His research interests include vehicle body design, process system optimization and intelligent transportation systems.

    Ning Jia received his B. S. degree in 2005 from the Department of Management, Shandong University, China, and Ph.D. degree in 2010 from Institute of Systems Engineering, Tianjin University, China. Since 2010, he has been working in the Institute of Systems Engineering, Tianjin University. Now he is an associate professor in the College of Management and Economics in the above university. His research interests include intelligent transportation systems and urban traffic control and management.
