1 Introduction

Forecasting is an area of interest in most people’s lives. People would like to know what the weather will be like this weekend in order to make appropriate plans. Investors are interested in knowing how well certain stocks will perform before purchasing or selling. Travelers would like to know what the traffic conditions will be like in order to make travel plans in advance. The desire to know the future, or make the best possible predictions of it, and the benefits it brings have driven scientists to develop various forecasting models and study the factors that may improve forecasting accuracies.

The advancements in computing hardware and big data frameworks have enabled scientists to perform big data analytics on increasing amounts of data. In recent years, large numbers of traffic sensors have been deployed throughout major urban areas and freeways in the United States to collect data such as traffic flow, speed, lane occupancy, etc. These increasing amounts of data allow scientists to develop, test and validate forecasting models in order to achieve higher forecasting accuracies, which is a very important task for improving traffic management and control systems. The data used in this study are traffic flow data (number of vehicles per hour) from Jan 1, 2013 to June 30, 2017 reported by traffic sensors deployed throughout the state of Georgia. These data may be accessed from the Georgia Department of Transportation website.Footnote 1 Various forecasting models, including seasonal ARIMA, exponential smoothing and neural networks, are trained, validated and evaluated on these data.

It is intuitive that weather conditions such as rain affect traffic flow, and the relationship between weather conditions and traffic flow has been studied in work such as [6, 21, 22]. Recent work in [27] used rainfall information to help predict traffic speed; yet surprisingly little research has directly incorporated rainfall data into forecasting models to help predict traffic flow, although a few such studies have emerged in very recent years and are discussed in the Related Work section. In this work, a dynamic regression model is used to incorporate precipitation data as an exogenous explanatory variable in combination with seasonal ARIMA and exponential smoothing, while neural networks simply require an extra neuron in the input layer to take the rainfall data into account.

Much of the existing work on forecasting traffic also focuses on the immediate short term (e.g., forecast 15 min ahead into the future); typically, only a couple of months of data from relatively few sensors are utilized (e.g., in [5, 17, 18]). Though short term traffic forecasting is certainly an important task for traffic control systems such as smart traffic lights, relatively long term traffic forecasting can be beneficial to the general public for trip planning in advance, which may also lead to the development and improvements of related software/apps.

Therefore, in order to address these issues, this work aims to complement existing work in the literature with the following contributions: (1) evaluating various commonly used forecasting models on large amounts of traffic flow data covering a great number of locations and times; (2) studying the forecasting powers of the models for both short and relatively long terms, 24 h into the future in this case; and (3) examining the effects of the inclusion of rainfall data on short term traffic flow forecasting.

All implementations of the forecasting models used in this study are available in the ScalaTion project, which is a Scala-based project for analytics, simulation and optimization freely available under an MIT License. For more information about this project, please visit cs.uga.edu/~jam/scalation.html.

The rest of this paper is organized as follows: Sect. 2 discusses the background of the forecasting models. Related Work is discussed in Sect. 3. Section 4 contains Evaluations, in which experimental setups and results are discussed. Finally, Conclusion and Future Work are in Sect. 5.

2 Background

Various forecasting models have been developed throughout the years. Classical statistical models such as ARIMA [1] remain very capable and commonly used today. Machine learning models such as Neural Networks (NN) are also gaining popularity and attention due to recent progress in deep learning. This section details the forecasting models that are implemented in ScalaTion and used in this study.

2.1 ARIMA Family of Models

The AutoRegressive Moving Average, or ARMA(p, q), model is a commonly used classical statistical model that may be defined as

$$\begin{aligned} Y_t = \sum _{i = 1}^{p} \phi _{i}Y_{t-i} + \sum _{i = 1}^{q} \theta _{i}\epsilon _{t-i} + \epsilon _t \; , \end{aligned}$$
(1)

where \(Y_t\) is the zero-mean transformed response variable of interest; p and q are the orders of the AutoRegressive (AR) and Moving Average (MA) components, respectively; \(\phi \)’s and \(\theta \)’s are the parameters of the AR and MA components, respectively;Footnote 2 and the error term \(\epsilon _t\) is assumed to be independently and identically distributed (i.i.d.) from \(\mathcal {N}(0, \sigma ^2)\) for some constant variance \(\sigma ^2\). Techniques such as Box-Cox transformations can be applied to stabilize the variance, while differencing is usually done to stabilize the mean. By combining the ARMA model with differencing, a more general model known as the AutoRegressive Integrated Moving Average, or ARIMA(p, d, q), model is obtained, where d is the order of the differencing.
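To make the recursion in Eq. 1 concrete, the following sketch simulates a zero-mean ARMA(p, q) process with i.i.d. Gaussian errors. The paper's implementations are in ScalaTion (Scala); this Python version, including the function name `simulate_arma`, is purely illustrative.

```python
import random

def simulate_arma(phi, theta, n, sigma=1.0, seed=42):
    """Simulate n values from a zero-mean ARMA(p, q) process (Eq. 1).

    phi   -- AR parameters [phi_1, ..., phi_p]
    theta -- MA parameters [theta_1, ..., theta_q]
    """
    rng = random.Random(seed)
    p, q = len(phi), len(theta)
    y, eps = [], []
    for t in range(n):
        e = rng.gauss(0.0, sigma)
        # AR part: weighted sum of the p previous values of the series
        ar = sum(phi[i] * y[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        # MA part: weighted sum of the q previous error terms
        ma = sum(theta[i] * eps[t - 1 - i] for i in range(q) if t - 1 - i >= 0)
        y.append(ar + ma + e)
        eps.append(e)
    return y

series = simulate_arma(phi=[0.7], theta=[0.3], n=200)
```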

The ARIMA model may be further improved if the data are known to contain seasonality, which is just a repeated pattern over a fixed period of time. This is definitely applicable for the traffic flow data, in which the seasonal period could be one week (e.g., we would expect similar traffic flow on the same road during the morning rush hours of this Monday and previous Mondays). A more general seasonal model may be defined as

$$\begin{aligned} Y_t = \sum _{i = 1}^{p} \phi _{i}Y_{t-i} + \sum _{i = 1}^{q} \theta _{i}\epsilon _{t-i} + \sum _{i = 1}^{P} \varPhi _{i}Y_{t-is} + \sum _{i = 1}^{Q} \varTheta _{i}\epsilon _{t-is} +\epsilon _t \; , \end{aligned}$$
(2)

where s is the seasonal period; P and Q are the orders of the seasonal AR and MA components, respectively; and \(\varPhi \)’s and \(\varTheta \)’s are the parameters of the seasonal AR and MA components, respectively. Seasonal differencing may also be applied as necessary, and this type of model is known as the Seasonal ARIMA, or simply SARIMA \((p,d,q) \times (P,D,Q)_s\), model, where D is the order of the seasonal difference.
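The differencing steps behind the d and D orders are simple to state in code. The sketch below, an illustrative Python version (function names are not from the paper), applies D rounds of seasonal differencing at lag s followed by d rounds of ordinary differencing, which is the transformation implied by a SARIMA \((p,d,q) \times (P,D,Q)_s\) specification:

```python
def difference(y, lag=1):
    """One round of differencing at the given lag: y'_t = y_t - y_{t-lag}."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]

def sarima_transform(y, d=0, D=1, s=120):
    """Apply D seasonal differences of period s, then d ordinary
    differences, as in a SARIMA(p,d,q)x(P,D,Q)_s model."""
    for _ in range(D):
        y = difference(y, lag=s)
    for _ in range(d):
        y = difference(y, lag=1)
    return y
```

For a perfectly seasonal series, one seasonal difference reduces the series to zeros, which is why seasonal differencing stabilizes the seasonal mean.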

In this study, the SARIMA\((1,0,1) \times (0,1,1)_{120}\) model with weekly seasonality of 24 hours/day \(\times \) 5 weekdays/week = 120 hours/week is chosen since it was used in several related studies including [18, 24, 28]. Alternatively, SARIMA models may be selected based on a scoring function in an automated fashion as described in [12]. This automated order selection process is also implemented in ScalaTion. The SARIMA models selected using the automated process based on the AICc criterion as recommended in Sect. 8.9 of [11] may yield better performances than the SARIMA\((1,0,1) \times (0,1,1)_{120}\) model in certain individual traffic flow time series, but can also be overfitted with more parameters than necessary in others. The final overall results of the automated SARIMA models are actually slightly worse than the results obtained from the SARIMA\((1,0,1) \times (0,1,1)_{120}\) model. Therefore only SARIMA\((1,0,1) \times (0,1,1)_{120}\) results are reported in this work.

Dynamic Regression. If an exogenous explanatory variable is available to help model the response (e.g., using precipitation data to help predict traffic flow in this study), then dynamic regression may be used. In a similar way as described in Sect. 9.1 of [11], the dynamic regression model may be defined as

$$\begin{aligned} \epsilon _t = \beta x_t + z_t \end{aligned}$$
(3)

where \(\epsilon _t\) could be the residual of a SARIMA model described in Eq. 2 (or other time series forecasting models such as exponential smoothing); \(x_t\) is the exogenous explanatory variable; and \(z_t\) is the residual for this regression model. The dynamic regression model can therefore be viewed as a two-step process of attempting to explain variabilities within a time series by using both a forecasting model and a regression model.
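The two-step process can be sketched as follows: given the residuals of a base forecasting model and an exogenous series, estimate \(\beta \) by least squares (Eq. 3, no intercept since the residuals are roughly zero-mean) and add the regression term to the base forecast. This Python sketch is illustrative only; the function names are not from ScalaTion.

```python
def ols_slope(x, e):
    """Least-squares slope beta (no intercept) for e_t ~ beta * x_t (Eq. 3)."""
    sxx = sum(xi * xi for xi in x)
    sxe = sum(xi * ei for xi, ei in zip(x, e))
    return sxe / sxx if sxx else 0.0

def dynamic_regression_forecast(base_forecast, beta, x_future):
    """Adjust a base model's forecast with the regression term beta * x."""
    return base_forecast + beta * x_future
```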

2.2 Exponential Smoothing

Another popular time series forecasting model is exponential smoothing [3, 9]. Since the traffic flow data are inherently seasonal, triple exponential smoothing [29] is required. Two types of seasonality exist for this model: additive and multiplicative. Additive seasonality is more appropriate for a problem like predicting traffic flow, as suggested in Sect. 7.5 of [11], and may be defined as

$$\begin{aligned} \begin{gathered} S_t = \alpha (Y_t - c_{t-L}) + (1-\alpha )(S_{t-1} + b_{t-1}) \; ,\\ b_t = \beta (S_t - S_{t-1}) + (1-\beta )b_{t-1} \; ,\\ c_t = \gamma (Y_t - S_t) + (1-\gamma )c_{t-L} \; . \end{gathered} \end{aligned}$$
(4)

where \(S_t\) is the smoothed value of \(Y_t\), an observed value of the time series at time t; \(b_t\) is the trend factor; \(c_t\) is the seasonal factor; L is the seasonal period; and \(\alpha \), \(\beta \) and \(\gamma \) are all smoothing parameters, bounded between 0 and 1, that need to be estimated. The standard practice of parameter estimation is to minimize the one-step ahead within sample forecast sum of squared errors (SSE), which is

$$\begin{aligned} SSE = \sum _{t=2}^{n}(Y_t - \hat{Y}_{t|t-1})^2 \; , \end{aligned}$$
(5)

where n represents the number of observations in the data and \(\hat{Y}_{t|t-1}\) is the one-step ahead forecast at time t when given the data up to time \(t-1\).

In this work, the parameters are optimized by minimizing the 12-step ahead, as opposed to the standard 1-step ahead, within-sample forecast SSE. Preliminary testing shows that if the parameters are optimized by minimizing the 1-step ahead within-sample forecast SSE, forecast performance is only good one step ahead and rather poor for all subsequent steps. This could be caused by the optimizer's failure to rely on the seasonal components to minimize the SSE, since the most recent lagged value can also effectively minimize the SSE for the immediate 1-step ahead forecast. On the other hand, if the 12-step ahead within-sample forecast SSE is to be minimized, the optimizer must rely on the seasonal components of the time series, and therefore the learned parameters tend to be better suited for both short and relatively long term forecasts.
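The recursions of Eq. 4 can be sketched directly. The initialization scheme below (level = mean of the first season, zero trend, seasonal factors = deviations from that mean) is an assumption of this sketch, not the paper's exact procedure, and the function names are illustrative.

```python
def holt_winters_additive(y, L, alpha, beta, gamma):
    """Triple exponential smoothing with additive seasonality (Eq. 4).
    Returns the final level S, trend b and seasonal factors c."""
    assert len(y) >= 2 * L
    season_mean = sum(y[:L]) / L
    S, b = season_mean, 0.0                      # initial level and trend
    c = [y[i] - season_mean for i in range(L)]   # initial seasonal factors
    for t in range(L, len(y)):
        S_prev = S
        S = alpha * (y[t] - c[t % L]) + (1 - alpha) * (S_prev + b)
        b = beta * (S - S_prev) + (1 - beta) * b
        c[t % L] = gamma * (y[t] - S) + (1 - gamma) * c[t % L]
    return S, b, c

def hw_forecast(S, b, c, L, n, h):
    """h-step ahead forecast after n observations:
    level + h * trend + matching seasonal factor."""
    return S + h * b + c[(n + h - 1) % L]
```

On a perfectly seasonal, trend-free series the recursions leave the level and seasonal factors fixed, so multi-step forecasts reproduce the seasonal pattern exactly.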

2.3 Neural Networks

The structure of a neural network consists of multiple layers of artificial neurons. One type of neural network is the feedforward neural network, in which neurons in a layer may only send signals forward to neurons in the subsequent layer. Other types also exist, such as Recurrent Neural Networks (RNN), in which neurons in a layer may also send signals backward. This study uses feedforward neural networks. The process by which a single neuron handles its incoming signals and produces an output signal may be defined as

$$\begin{aligned} a_{\text {out}} = \sigma (\mathbf {w} \cdot \mathbf {a}_{\text {in}} + b) \; , \end{aligned}$$
(6)

where \(\mathbf {a}_{\text {in}}\) is the vector of incoming signals; \(\mathbf {w}\) is the vector of weights associated with the incoming signals; b represents the bias; \(\sigma \) is the activation function; and \(a_{\text {out}}\) is the output signal/activation.

A standard choice for the cost function for prediction problems is the Mean Squared Error (MSE) between the final output signals of the network and the observed training outputs, which may be defined as

$$\begin{aligned} MSE = \frac{1}{n}\sum _{i = 1}^{n} \big \Vert \mathbf {y}_i - \mathbf {a}_i \big \Vert ^2 \; , \end{aligned}$$
(7)

where n is the number of training instances; \(\mathbf {a}_i\) is the vector of final output signals produced by the i-th training/input instance; and \(\mathbf {y}_i\) is the i-th observed output vector. The minimization of the cost function can be done using stochastic gradient descent with backpropagation [23].

The design of an appropriate neural network model for a particular task such as forecasting traffic flow can be extremely flexible or complex, depending on which side of the coin one chooses to look at. After experimentation and trials, a four-layer neural network structure is adopted for forecasting traffic flow in this work. The input layer takes in the day of the week, the time of the day, the most recent 24 h of traffic flow data, and the 24 h of traffic flow data in the previous seasonal period (i.e., if a forecast of a Monday's traffic flow is desired, then the previous Monday's traffic flow data are used as inputs). There are two hidden layers, of sizes 40 and 30, and a final output layer of size 24, one neuron for each step ahead forecast. The tanh activation function is used in this neural network. Since tanh can only output values between −1 and 1, and the magnitude of its gradient is greatest in the domain between −1 and 1, the time series are normalized using Min-Max Normalization to the range of −0.8 to 0.8, in order to leave the neural network some room to output values slightly greater and less than the maximum and minimum values in the training set, respectively.

In terms of parameter tuning, neural networks do not have a straightforward and intuitive way to choose parameters similar to the Box-Jenkins method used to choose the orders of an ARMA model. At times it can also be difficult to explain why parameters tuned a certain way simply yield better performances on a certain dataset. A common, and somewhat expensive, approach is to use an automated grid search. In this study, data from eight randomly selected traffic sensors are used for parameter tuning. For each of the chosen time series, a random starting day is selected; 3 months of training data are then used to train various neural network models with different parameters, and the subsequent 2 months of data are used for testing. The final set of parameters of the four-layer neural network used in this study is as follows: the number of training epochs is set to 600, the mini-batch size is 20, and the learning rate is set to 0.1.
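The Min-Max Normalization to the range −0.8 to 0.8 described above is a linear rescaling that can be inverted after forecasting. The sketch below is an illustrative Python version (the paper's implementation is in ScalaTion); the function names are assumptions.

```python
def minmax_normalize(y, lo=-0.8, hi=0.8):
    """Scale a series so the training minimum maps to lo and the maximum
    to hi, leaving tanh outputs in (-1, 1) some headroom beyond the
    training extremes. Returns the scaled series and the bounds."""
    y_min, y_max = min(y), max(y)
    span = y_max - y_min
    if span == 0:
        return [0.0] * len(y), (y_min, y_max)
    scaled = [lo + (v - y_min) * (hi - lo) / span for v in y]
    return scaled, (y_min, y_max)

def minmax_denormalize(z, bounds, lo=-0.8, hi=0.8):
    """Invert minmax_normalize to recover values on the original scale."""
    y_min, y_max = bounds
    return [y_min + (v - lo) * (y_max - y_min) / (hi - lo) for v in z]
```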

3 Related Work

Recent work in [18] compared various forecasting techniques for short term traffic flow forecasting. The techniques included ARIMA based models, Support Vector Regression (SVR) based models and feedforward neural networks. A total of nine months of data, from January 2009 to September 2009, for sixteen vehicle detector stations were collected from the California Freeway Performance Measurement System (PeMS).Footnote 3 The data were aggregated into 15-minute intervals. The first four months were used to train the models, the next two months for validation and model parameter tuning (for SVR and NN), and the final three months were used for testing. The forecasting accuracies of 15-minute ahead forecasts were used to compare the performances of the models; no forecasts beyond 15 min into the future were produced. The authors concluded that a SARIMA model performed the best overall.

Another recent study in [19] utilized a deep neural network built from stacked autoencoders to predict traffic flow. Data were also collected from a very large number of detectors in the PeMS database for the first three months of 2013. The data were aggregated into 5-minute intervals. The first two months were used for training and the remaining month for testing. The proposed deep neural network was compared against Support Vector Machine (SVM), a backpropagation neural network and a radial basis function neural network. However, the authors did not clearly specify the inputs and the parameters used in these comparison models. No statistical models such as SARIMA were included. Forecasts for 15, 30, 45 and 60 min into the future were produced, and the authors' proposed deep neural network exhibited superior performance.

Many have studied the impact of weather conditions on traffic. A study done in [13] concluded that traffic flow may be reduced by 14% – 15% during heavy rainfall (>0.25 in/hr) in Toronto. The Highway Capacity Manual [20] and a more recent study done in the Twin Cities metropolitan area [21] also reached a similar conclusion. Another study in [25] claimed that light and moderate rainfall (<0.25 in/hr) can reduce freeway traffic flow by 4% to 10%, while heavy rain can reduce freeway traffic flow by 25% to 30% in Hampton Roads, Virginia. Other related studies include [2, 6].

Historically, not as much research has directly incorporated weather data to help with traffic flow forecasting; such studies have only emerged in very recent years. There has also been work that used precipitation data to help forecast traffic speed, such as [10, 27], in which neural network and ARIMA based models were used, respectively. A study in [8] may be one of the earliest that attempted to use weather information to help forecast traffic flow. A neural network model was used, but the input weather information was encoded in categorical variables of 0 (clear), 1 (rain) and 2 (snow/ice) because detailed rainfall information was not available to the authors. Another study in [4] included rainfall data as inputs to a neural network model to forecast traffic flow. However, when the model incorporating rainfall data was compared with the one without, worse performances were obtained. The authors suggested that the counter-intuitive results could be due to the lack of rainy days in the training instances.

A recent study in [7] used a combination of stationary wavelet transform and neural networks to predict traffic flow with the incorporation of rainfall data. Data from two traffic sensors in Dublin, Ireland were used to evaluate the performance of the authors' proposed model and a standard feedforward neural network. The study showed that incorporating rainfall data certainly helps to improve prediction, and the authors' proposed algorithm performed better, though no other models were used for comparison. A couple more studies using the deep learning approach have emerged in very recent years. One such study in [16] incorporated weather information such as rain, temperature, humidity, etc., into a deep belief network. Performance comparisons were done with ARIMA and a three-layer neural network model. The authors' proposed deep belief network outperformed ARIMA significantly and did better than the three-layer neural network, though the margins were not as great. Another work in [30] used a combination of recurrent neural network and gated recurrent unit to predict urban traffic flow. The weather data included precipitation, speed and temperature. The authors demonstrated that the incorporation of weather data can improve forecasting accuracy; however, no other models were used for comparison purposes; only the authors' proposed model, with and without weather data, was included in the performance evaluations. A study in [14] compared multiple models, including ARIMA, backpropagation neural network, deep belief network and long short-term memory (LSTM) neural network for forecasting traffic flow. Rainfall data were incorporated into the aforementioned models. Forecasts were produced for the immediate short term, 10 and 30 min ahead into the future. The authors concluded that the LSTM neural network was the top-performing model, and that the incorporation of rainfall data generally improves forecasting performance for most of the models tested.

4 Evaluations

This section details the description, selection and pre-processing of the datasets, the forecasting procedures and evaluation metrics, and the experimental designs and results.

4.1 Dataset Description and Pre-processing

There are 275 permanent road sensors deployed by the Georgia Department of Transportation. Each sensor records the traffic flow for the two separate directions (north and south, or east and west) of a road, as well as the aggregate traffic flow. For this study, weekday data from January 2013 to June 2017 are used. Data from weekends are excluded since there is usually much less congestion during weekends and the traffic patterns on weekends differ from those of weekdays. The practice of removing weekends is very common in the literature, as seen in [15, 18, 19, 26]. In addition, only forecasts from 7:00 am to 7:00 pm are evaluated since traffic at night is usually not congested.

Fig. 1. Friday traffic on US 23 in Atlanta, GA

Figure 1 provides a graphical view of the traffic flow data of all the Fridays in the year 2013 on a major road in Atlanta, GA. Note the greater outflowing traffic in the afternoon rush hours as people return home from work. The data from the same traffic sensor in the opposite direction of the road show a complementary traffic pattern, with more vehicles in the morning rush hours as people attempt to arrive at work on time.

Missing values in the data must be handled since the forecasting models expect complete training data. Most commonly, data are missing for an entire day. This is most likely due to the quality control system, which rejects data for an entire day based on rules like “the system will reject any day that does not have data for every hour.”Footnote 4 For an entire day of missing data, the hourly historical averages of the same weekday from the past four weeks are used for imputation (i.e., if a Tuesday's data are missing, the hourly averages of the last four Tuesdays' data are used to impute the values). Occasionally, data are missing for a single hour of a day, possibly due to imperfections of the quality control system. In this scenario, a simple linear interpolation is used to impute the value by computing the average of the data from the hour before and the hour after. Lastly, some sensors simply have too many missing values to be useful; in extreme cases, a sensor may contain no data at all. Therefore any sensor with more than one year of missing data is excluded from this study. The number of remaining usable traffic sensors is 157.
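The two imputation rules can be sketched as follows. This is an illustrative Python version of the procedure described above; the data layout (a dict keyed by weekday-slot index and hour) and the function names are assumptions of the sketch.

```python
def impute_full_day(data, day_index, hours=24, weeks=4, week_days=5):
    """Impute a fully missing weekday using the hourly averages of the
    same weekday over the previous `weeks` weeks. `data` maps
    (weekday_slot, hour) -> flow; missing entries are simply absent.
    With weekends removed, the same weekday repeats every 5 slots."""
    imputed = {}
    for hour in range(hours):
        vals = [data[(day_index - w * week_days, hour)]
                for w in range(1, weeks + 1)
                if (day_index - w * week_days, hour) in data]
        if vals:
            imputed[(day_index, hour)] = sum(vals) / len(vals)
    return imputed

def impute_single_hour(before, after):
    """Linear interpolation for a single missing hour."""
    return (before + after) / 2.0
```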

The usable sensors are further filtered to include mostly urban areas and busy freeways, where congestion is most likely to occur on a regular basis. Other sensors scattered across the state of Georgia, in more rural areas and on freeways that are not very busy, are excluded from this study. In particular, traffic sensors from 15 counties, as summarized in Table 1, are included. The total number of sensors from the selected counties is 74, and the overall percentage of missing values is close to 8.5%.

Table 1. Summary of Traffic Sensors

The precipitation data come from the Automated Surface Observing System (ASOS)Footnote 5, a joint program maintained by the National Weather Service (NWS) and the Federal Aviation Administration (FAA). The ASOS sensors are typically placed in airports or air bases. Data are downloaded in hourly resolution through a convenient web interfaceFootnote 6 provided by the Department of Agronomy of Iowa State University. A total of 57 ASOS stations contain records from January 2013 to June 2017, but unfortunately more than half of them contain all 0’s or very little data. The number of usable ASOS sensors is 22.

Each traffic sensor is then paired with its closest ASOS sensor based on GPS coordinates. Roughly 44% of the traffic sensors are paired with ASOS sensors located within 5 miles; 76% can find ASOS sensors within 10 miles; 86% within 15 miles; and 90% within 20 miles. For the traffic sensors located in two coastal counties in southeastern Georgia, Camden (2 sensors) and Glynn (5 sensors), the closest ASOS sensor is about 75 and 50 miles away, respectively. In the end, 14 of the 22 ASOS sensors are used for pairing with traffic sensors; the 8 remaining ASOS sensors are not located close enough to any of the traffic sensors included in this work.
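Nearest-station pairing from GPS coordinates can be done with the haversine great-circle distance. The paper does not specify its distance formula, so the sketch below (with assumed function names and example station data) is one plausible Python implementation.

```python
import math

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two GPS coordinates."""
    r = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def closest_station(traffic_gps, stations):
    """Return (name, distance) of the nearest ASOS station.
    `stations` maps station name -> (lat, lon)."""
    lat, lon = traffic_gps
    return min(((name, haversine_miles(lat, lon, slat, slon))
                for name, (slat, slon) in stations.items()),
               key=lambda t: t[1])
```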

The percentage of missing values in the 14 ASOS sensors is about 1.7%. The missing values are imputed by generating random values from Gaussian distributions, for which the means and variances are computed from the most recent 5 observations.
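A minimal sketch of this Gaussian imputation follows. Clipping the draw at zero is an added assumption of the sketch (precipitation cannot be negative), and the function name is illustrative.

```python
import math
import random

def impute_gaussian(recent, seed=0):
    """Draw an imputed value from N(mean, var), where the mean and
    variance come from the most recent observations (the last 5 in
    this study). The draw is clipped at zero as an extra assumption,
    since precipitation cannot be negative."""
    rng = random.Random(seed)
    m = sum(recent) / len(recent)
    var = sum((v - m) ** 2 for v in recent) / len(recent)
    return max(0.0, rng.gauss(m, math.sqrt(var)))
```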

4.2 Forecast Validation and Performance Metrics

Rolling forecast validation with a fixed window size w, the size of the training set, is used to test the models. The forecasting horizon h is the number of steps ahead into the future for which forecasts are produced. Each forecasting model is trained on w observations and then tested on the subsequent 8 weeks, which serve as a testing set. Since it may not be feasible to forecast all 8 weeks at once, only h-step ahead forecasts are produced each time. The forecasting models continually forecast h steps into the future, each time using the most recent hourly data as inputs, until the end of the testing set is reached. No forecasts are produced outside the 7:00 am to 7:00 pm range, and imputed values are excluded from performance comparisons as well. After all forecasts have been made on the testing set, the window of training data slides over by 8 weeks, taking in 8 weeks of new observations and dropping the oldest 8 weeks of data, and the training and forecasting processes are repeated.
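The index arithmetic of this rolling scheme can be sketched as a generator of (train start, train end, test end) triples. This illustrative Python version assumes the window slides by exactly the test-block length, as described above.

```python
def rolling_windows(n, w, test_len):
    """Yield (train_start, train_end, test_end) index triples for rolling
    forecast validation: a fixed training window of w observations
    followed by a test block of test_len observations, with the window
    sliding forward by test_len each round."""
    start = 0
    while start + w + test_len <= n:
        yield start, start + w, start + w + test_len
        start += test_len
    # a final partial test block, if any, is skipped in this sketch
```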

Forecasting accuracies are measured using the Mean Absolute Percentage Error (MAPE) metric, which may be defined as

$$\begin{aligned} MAPE = \frac{1}{n}\sum _{i = 1}^{n}\Big |\frac{y_i - \hat{y_i}}{y_i}\Big | \; , \end{aligned}$$
(8)

where n is the total number of forecast values, \(\hat{y_i}\) is the i-th forecast value, and \(y_i\) is the corresponding observed value.
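Eq. 8 translates directly into code. Skipping zero-valued observations to avoid division by zero is an assumption of this illustrative sketch (the paper does not state how zero flows are handled in the MAPE computation).

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error (Eq. 8). Zero-valued observations
    are skipped here to avoid division by zero."""
    pairs = [(y, f) for y, f in zip(actual, forecast) if y != 0]
    return sum(abs((y - f) / y) for y, f in pairs) / len(pairs)
```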

4.3 Experimental Setups and Empirical Results

The first experiment focuses on evaluating the forecasting performances of various models for both short and relatively long terms. The sliding window w is set to 12 weeks of hourly traffic flow data, and the forecasting horizon h is set to 24, meaning 24 forecasts are produced, ranging from 1 h ahead to 24 h ahead into the future.

A baseline, the weekly historical averages computed from instances in the training set, is also included for comparison with the aforementioned models. Each of the 74 traffic sensors contains two separate time series, one for each direction of the road, yielding a total of 148 univariate traffic flow time series. Over 1.7 million forecast values are produced per model (excluding the baseline) per step across all 148 time series. The total number of forecast values for all models, all 24 steps, and all 148 time series is close to 130 million. Since the time series contain different numbers of forecast values due to different numbers of missing values, the final results are aggregated by computing weighted averages across the time series; they are summarized in Fig. 2. All testing is done on a 48-core AMD Opteron machine from the Sapelo cluster of the Georgia Advanced Computing Resource CenterFootnote 7 to facilitate parallel processing.

Fig. 2. Performance comparison of forecasting models

It comes as no surprise that the forecasting models tend to perform well in the immediate short terms. The SARIMA model produces reasonably accurate forecasts in the immediate short terms and experiences sharp declines in forecasting accuracy up to about step 5; performance then drops gradually for all remaining steps, yet always yields better results than the baseline. The exponential smoothing model yields the lowest performance overall and experiences a slow and steady decline; at around step 20, it is no longer more effective than the baseline. The performance of the neural network is generally the best, leading in forecasting accuracy up to about 12 steps ahead. The neural network then performs similarly to SARIMA, and starts to perform slightly worse at around step 20, yet always remains below the baseline.

The second experiment focuses on using precipitation data to aid traffic flow forecasting. The sliding window w is expanded to 48 weeks of hourly traffic flow data in order to include more training instances with rainfall. The forecasting horizon h is reduced to 1 since, at any given time point, it is difficult to make reliable long term predictions of precipitation without great expertise in the field of weather forecasting and possibly additional data such as satellite images. The dynamic regression models are used to regress residuals from the SARIMA and exponential smoothing models on rainfall data in order to help explain additional variability. Only training instances that experience at least a moderate amount of rain (>0.1 in/hr) are considered for the regression.

As for the neural network, an additional input neuron representing rainfall is added to the input layer. An additional neuron is also added to each of the two hidden layers, and the number of neurons in the output layer is reduced to one. It is reasonable to assume that there is a great difference in traffic patterns between hours with no rain at all and hours with some rain. In other words, it would be helpful for the neural network to recognize that there is a greater gap between 0 in. of rain and 0.1 in. of rain than between 0.1 in. of rain and 0.2 in. of rain. To simulate this effect, all values of 0 in the ASOS precipitation datasets are replaced with −1.5 before the data are normalized for training of a neural network.

To forecast, the models first require a prediction of the rainfall 1 h ahead in order to produce 1-step ahead traffic flow forecasts. The future rainfall is estimated by examining the most recent 3 h and averaging the values that are greater than 0 in/hr. Forecasts are only produced for the instances in the testing sets with at least moderate rainfall (>0.1 in/hr). Overall, close to 80 thousand forecasts are produced across all models and time series. The results are summarized in Table 2.
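The rainfall estimate described above reduces to a few lines. The function name in this illustrative Python sketch is an assumption; returning 0 when all three recent readings are dry is implied by the averaging rule.

```python
def estimate_next_hour_rain(last3):
    """Estimate next-hour rainfall (in/hr) as the average of the nonzero
    values among the most recent 3 hourly readings; 0 if all are dry."""
    wet = [v for v in last3 if v > 0]
    return sum(wet) / len(wet) if wet else 0.0
```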

Table 2. Short term forecasting in rainy weather

An additional baseline representing the weekly historical averages over the most recent 12 weeks before the testing set is also included to draw comparisons with the previous experiment. The first baseline is the weekly historical averages computed from the entire training set, which has been expanded to 48 weeks. The manner in which the two baselines take account of the rainfall data is to reduce the historical averages by 15% during heavy rain (>0.25 in/hr) or 10% during moderate rain (>0.1 in/hr), as suggested by a couple of studies discussed in Sect. 3.
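The rain adjustment applied to both baselines can be sketched as a simple threshold rule (the function name is illustrative):

```python
def rain_adjusted_baseline(avg_flow, rain_in_per_hr):
    """Reduce the historical-average baseline by 15% in heavy rain
    (>0.25 in/hr) or 10% in moderate rain (>0.1 in/hr)."""
    if rain_in_per_hr > 0.25:
        return avg_flow * 0.85
    if rain_in_per_hr > 0.1:
        return avg_flow * 0.90
    return avg_flow
```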

From the numbers in Table 2, it is immediately obvious that the two baselines perform very poorly on rainy days. The performances of all models improve with the inclusion of rainfall data, and the neural network remains the top performer. It is also interesting to note that for the 1-step ahead forecast, there is a greater gap between the performances of SARIMA and the neural network in this experiment than in the previous one. This could be due to the ability of a neural network, with its very large number of parameters, to better handle complex traffic situations and sudden changes that are potentially due to rainfall or other factors.

5 Conclusion and Future Work

In this work we evaluated several commonly used forecasting models on some currently under-researched problems: both short and relatively long term traffic flow forecasting, and the incorporation of precipitation data into forecasting models to make better forecasts on rainy days. The neural network model was the top performer overall in both experiments.

In terms of future work, there are several directions that can be pursued to further improve upon this work. Bigger datasets with higher resolutions are always helpful; the hourly resolution may not always be sufficient to capture the dynamic nature of traffic patterns in major urban areas and on freeways. The Caltrans Performance Measurement System (PeMS)Footnote 8 from the state of California would be a great source of such data. More types of weather data besides precipitation could also be used, similar to a few studies mentioned in the Related Work section. Data on special events, major holidays and traffic accidents could all be incorporated into forecasting models as well to handle irregular patterns. Spatial dependencies should also be exploited, as traffic flowing in a certain direction at a particular location could help forecast traffic further down the road. Additional forecasting models, such as Recurrent Neural Networks and, in particular, Long Short-Term Memory Neural Networks, should be included in order to create a more comprehensive evaluation of forecasting techniques for traffic flow.