Prediction intervals to account for uncertainties in neural network predictions: Methodology and application in bus travel time prediction
Introduction
Artificial neural network models (neural networks hereafter) are receiving increasing attention across transportation engineering due to their modelling flexibility, predictive ability and generalization potential. Their application ranges from traffic operations (Van Lint et al., 2005, Smith and Demetsky, 1995, Chien et al., 1994, Dharia and Adeli, 2003), incident detection and prediction (Xie et al., 2007, Dia and Rose, 1998) and transportation planning (Dia and Panwai, 2007, Tillema et al., 2006) to infrastructure management (Yang et al., 2006, Mukkamala and Sung, 2003) and environmental studies (Cai et al., 2009, Shiva Nagendra and Khare, 2004). Neural networks have also been adopted in the public transport context to model bus travel times (Kalaputapu and Demetsky, 1995, Jeong and Rilett, 2004, Chen et al., 2007).
Traditionally, neural networks used for prediction purposes produce a point prediction when they are presented with a set of input values. However, there is always a degree of uncertainty associated with any point prediction. That uncertainty, as will be discussed shortly in Section 2, is attributable either to the structure of the model or to the inherent uncertainty in the dataset used for model development. For these reasons, point prediction performance deteriorates and predictions become unreliable.
A common problem associated with point predictions is that they deliver no information about the different kinds of uncertainty affecting prediction performance. However, the reliability of point predictions can be enhanced by providing a measure of prediction uncertainty (Khosravi et al., 2010a), or at least by quantifying the extent to which each source contributes to prediction unreliability. This issue has motivated some studies in the transportation literature to provide a prediction range, rather than a point prediction, for the relevant dependent variable. Inherently, the width of these ranges is directly related to the degree of confidence in the point predictions. For instance, studies focussed on predicting travel time variability provide a measure of uncertainty in travel time prediction by quantifying the variance of travel times (e.g. Fu and Rilett, 1998, Pattanamekar et al., 2003, Liu et al., 2005, Li, 2006). These variance values, which indicate the extent of variability/reliability of travel times, would then benefit passengers by helping them to better plan their trips, and hence would have a range of applications in intelligent transportation systems. In public transport operations, predicting a range for travel times can assist in defining slack times needed in the scheduling process to maximize on-time performance (Mazloumi et al., 2010). The quality of transit signal priority schemes can also be enhanced by providing an arrival time interval for individual buses at a certain downstream signalized intersection (Kim and Rilett, 2005).
To cope with the weakness of neural networks in providing prediction confidence, one approach is to specify intervals (rather than points) within which predictions may lie with a predefined likelihood. Depending on which source of uncertainty these intervals consider, different terms are used for these measures of confidence, i.e. confidence intervals or prediction intervals. Many previous researchers have quantified confidence intervals. For instance, Van Hinsbergen et al. (2009) and Park and Lee (2004) used a Bayesian technique to construct confidence intervals for travel time predictions made by neural networks. From a Bayesian inference perspective, each parameter in a neural network is conceived as a distribution rather than a single value. Consequently, neural network outcomes will also form a distribution, which can be further used to construct intervals around each prediction point. However, the computationally intensive nature of the Bayesian technique has limited the application of this approach in confidence estimation for neural network predictions (Dybowski and Roberts, 2001). Moreover, to the best of our knowledge, no work has been completed to construct prediction intervals for neural networks employed in transportation applications.
This paper contributes to the understanding of this domain by demonstrating a relatively straightforward approach, founded in maximum likelihood techniques, for constructing prediction intervals. The maximum likelihood approach, as opposed to the Bayesian algorithm, gives rise to a single value (rather than a distribution) for each model parameter and hence for output values. The paper first discusses the possible sources of uncertainty in neural network predictions. Then, following a general description of neural networks, the concepts of confidence intervals and prediction intervals are presented and techniques to quantify each of them are discussed. The proposed methodology is then applied to predict bus travel times along a bus route in Melbourne, Australia, and its performance is evaluated. The final section of the paper presents the conclusions and identifies directions for future research.
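To make the distinction between the two kinds of interval concrete, the following minimal sketch builds both around a single point prediction. It follows the usual convention in the neural network literature, in which a confidence interval reflects model uncertainty only and a prediction interval also includes data noise. It assumes Gaussian errors and two independent variance components that have already been estimated; the names `var_model` and `var_noise` and all numerical values are hypothetical, and this is not the paper's own procedure.

```python
from math import sqrt
from statistics import NormalDist


def interval(point, variance, level=0.95):
    """Symmetric Gaussian interval around a point prediction."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided critical value
    half_width = z * sqrt(variance)
    return point - half_width, point + half_width


# Illustrative values only (seconds); not taken from the case study.
predicted_travel_time = 600.0
var_model = 15.0 ** 2   # assumed variance due to model uncertainty
var_noise = 40.0 ** 2   # assumed variance due to noise in the data

# Confidence interval: reflects model uncertainty only.
ci = interval(predicted_travel_time, var_model)
# Prediction interval: adds the noise variance, so it is never narrower.
pi = interval(predicted_travel_time, var_model + var_noise)

print("confidence interval:", ci)
print("prediction interval:", pi)
```

Because the noise variance is non-negative, the prediction interval always encloses the confidence interval at the same level.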
Section snippets
Sources of uncertainty
In the neural network community, it is common to consider two sources of uncertainty associated with neural network outcomes: uncertainty in the training dataset and uncertainty in the model structure (Heskes, 1997, Papadopoulos et al., 2001, Dybowski and Roberts, 2001). Those sources are discussed separately in the subsections which follow.
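Under the common assumption that the two sources are independent, their contributions add in variance. As a sketch, with symbols introduced here for illustration and not necessarily matching the paper's notation, let $t$ be an observed target, $y(\mathbf{x})$ the true regression and $\hat{y}(\mathbf{x})$ the network output:

```latex
\begin{align}
t &= y(\mathbf{x}) + \varepsilon, \qquad \mathbb{E}[\varepsilon] = 0, \quad \operatorname{Var}[\varepsilon] = \sigma_{\varepsilon}^{2}(\mathbf{x}) \\
t - \hat{y}(\mathbf{x}) &= \bigl[\, y(\mathbf{x}) - \hat{y}(\mathbf{x}) \,\bigr] + \varepsilon \\
\sigma_{\mathrm{total}}^{2}(\mathbf{x}) &= \sigma_{\mathrm{model}}^{2}(\mathbf{x}) + \sigma_{\varepsilon}^{2}(\mathbf{x})
\end{align}
```

Here $\sigma_{\mathrm{model}}^{2}$ is the variance of $y(\mathbf{x}) - \hat{y}(\mathbf{x})$, attributable to the model, and $\sigma_{\varepsilon}^{2}$ is the variance of the noise in the data.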
Methodology
In this section, the general mathematical structure of a neural network is presented as a foundation for the discussion which follows. Subsequent sections focus on quantification of .
Case study
The methodology detailed in the previous section is used to examine the impact of different sources of uncertainty in travel time predictions in the context of an 8-km-long portion of a bus route in inner Melbourne, Australia. This portion of the route comprises four sections of similar length, demarcated by five timing point stops (see Fig. 3). Those timing points are the major bus stops, and at each of them bus arrival/departure times are monitored to maintain consistency
Results
To predict the average travel time for each section, the best model is selected on the basis of the results reported in Table 1. To test the predictive ability of each model on unseen data, each model is then applied to the testing dataset (i.e. the 20% of the travel times that have been put aside). The results reported in Table 3 illustrate model performance (in terms of RMSE) by time period. Except for Section 4, the poorest model performance is in peak periods. In Section 4 there is also
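As an illustration of the evaluation described in this snippet, the sketch below holds out 20% of a set of records and reports RMSE separately for each time period. The record layout, the period labels, the `predict` stand-in and all numbers are hypothetical; only the 80/20 split and the use of RMSE by time period come from the text.

```python
import random
from collections import defaultdict
from math import sqrt


def rmse_by_period(records, predict):
    """Return {period: RMSE} for (period, features, observed) records."""
    squared_errors = defaultdict(list)
    for period, features, observed in records:
        squared_errors[period].append((predict(features) - observed) ** 2)
    return {p: sqrt(sum(e) / len(e)) for p, e in squared_errors.items()}


def holdout_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and split it into (train, test)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical data: (time period, input features, observed travel time in s).
rng = random.Random(1)
records = [
    (period, {"scheduled": 300.0}, 300.0 + rng.gauss(0.0, spread))
    for period, spread in (("AM peak", 60.0), ("off-peak", 20.0), ("PM peak", 70.0))
    for _ in range(50)
]

train, test = holdout_split(records)


def predict(features):
    # Stand-in for a trained network: simply returns the scheduled time.
    return features["scheduled"]


for period, value in sorted(rmse_by_period(test, predict).items()):
    print(f"{period}: RMSE = {value:.1f} s")
```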
Summary and conclusion
Despite existing reports of the successful exploitation of neural networks, the predictions made by neural networks are always prone to uncertainty. In this study, different sources of uncertainty associated with neural network outcomes were discussed, including uncertainty arising from inherent noise in input data, and that due to model structure. Two alternative measures were also introduced to quantify how different uncertainty sources contribute to total prediction uncertainty. Confidence
Acknowledgment
The authors would like to acknowledge Ventura Bus Company and VicRoads for supplying the GPS and SCATS data, respectively, for this research.
References (46)
- et al.
Prediction of hourly air pollutant concentrations near urban arterials using artificial neural network approach
Transportation Research Part D
(2009)
- et al.
Neural network model for rapid forecasting of freeway link travel time
Engineering Applications of Artificial Intelligence
(2003)
- et al.
Expected shortest paths in dynamic and stochastic traffic networks
Transportation Research Part B
(1998)
- et al.
A prediction interval-based approach to determine optimal structures of neural network metamodels
Expert Systems with Applications
(2010)
- et al.
Dynamic and stochastic shortest path in transportation networks with two components of travel time uncertainty
Transportation Research Part C: Emerging Technologies
(2003)
- et al.
An investigation of model selection criteria for neural network time series forecasting
European Journal of Operational Research
(2001)
- et al.
Artificial neural network based line source models for vehicular exhaust emission predictions of an urban roadway
Transportation Research Part D
(2004)
- et al.
Comparison of feature selection techniques for ANN-based voltage estimation
Electric Power Systems Research
(2000)
- et al.
Bayesian committee of neural networks to predict travel times with confidence intervals
Transportation Research Part C
(2009)
- et al.
Accurate freeway travel time prediction with state-space neural networks under missing data
Transportation Research Part C
(2005)
Predicting motor vehicle collisions using Bayesian neural networks: an empirical analysis
Accident Analysis & Prevention
An evolutionary algorithm that constructs recurrent neural networks
IEEE Transactions on Neural Networks
Adaptive Control Processes: A Guided Tour
Neural networks for pattern recognition
Using automatic passenger counter data in bus arrival time prediction
Journal of Advanced Transportation
Using neural networks to synthesize origin-destination flows in a traffic circle
Transportation Research Record
Modelling drivers’ compliance and route choice behavior in response to travel information
Nonlinear Dynamics
Development and evaluation of neural network freeway incident detection models using field data
Transportation Research Part C
Confidence intervals and prediction intervals for feed-forward neural networks
The jackknife, the bootstrap, and other resampling plans
Society for Industrial and Applied Mathematics
COVNET: a cooperative coevolutionary model for evolving artificial neural networks
IEEE Transactions on Neural Networks
Training feed-forward networks with the Marquardt algorithm
IEEE Transactions on Neural Networks
Practical confidence and prediction intervals
Advances in Neural Information Processing Systems