A new boosting algorithm for improved time-series forecasting with recurrent neural networks
Introduction
The reliable prediction of future values of real-valued time series has many important applications ranging from ecological modeling to dynamic systems control, finance and marketing.
Modeling the system that generated the series is often the first step in setting up a forecasting system, since such a model can provide an estimate of future values based on past values. When the model consists of a set of deterministic equations with known initial conditions, the evolution of the system can be determined, at least numerically, by solving those equations.
In general, however, the characteristics of the phenomenon that generates the series are unknown. The information available for prediction is limited to the past values of the series, and the relations that describe the evolution must be deduced from these values, in the form of approximate functional relations between past and future values.
The most frequently adopted approach to estimating single-step-ahead (SS) future values consists in applying a function f to the recent history of the time series, where x(t), for 0 ⩽ t ⩽ l, denotes the time series data available for building a model. In multi-step-ahead (MS) prediction, given {x(t), x(t − 1), x(t − 2), …}, a reliable estimate of x(t + h) is sought, h being the number of steps ahead.
Building such models requires an appropriate choice of f. Given their universal approximation properties, multi-layer perceptrons (MLPs [3]) are often successful in modeling nonlinear functions f. In this case, a fixed number p of past values is fed into the input layer of the MLP, and the output yields the predicted future value of the time series.
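As an illustration of this windowed approach, the following minimal sketch builds (p past values → next value) pairs from a toy series and fits a small MLP. scikit-learn is assumed, and the window size, network size and toy data are illustrative choices, not those used in the paper.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_windows(x, p):
    """Turn a 1-D series into (p past values, next value) training pairs."""
    X = np.array([x[t - p:t] for t in range(p, len(x))])
    y = x[p:]
    return X, y

# Toy data: a noisy sine wave stands in for a real time series.
rng = np.random.default_rng(0)
series = np.sin(np.linspace(0, 20 * np.pi, 2000)) + 0.05 * rng.standard_normal(2000)

p = 8                                   # illustrative window size
X, y = make_windows(series, p)
model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(X[:-200], y[:-200])           # hold out the last 200 points for testing
print("test MSE:", np.mean((model.predict(X[-200:]) - y[-200:]) ** 2))
```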
Using a time window of fixed size has proven to be limiting in many applications: if the window is too narrow, important information may be left out, while if it is too wide, useless inputs may act as distracting noise.
Ideally, for a given problem, the size of the time window should be adapted to the context. This can be achieved with recurrent neural networks (RNNs) [3], [4], trained by a gradient-based algorithm such as backpropagation through time (BPTT). The results obtained can be improved in several ways. One is to develop a more appropriate training algorithm based on a priori information drawn from knowledge of the application field (see for example [5]).
It is also possible to adopt general methods that improve the results of many different models. One of these is known as ‘boosting’, introduced in [6].
The small improvement that a “weak” model can achieve over a random estimate is substantially amplified by the boosting algorithm through the sequential construction of several such models, each concentrating progressively on the difficult examples of the original learning set. In this paper we focus on the definition of a boosting algorithm for improving the prediction performance of RNNs. A new parameter is introduced that regulates the strength of the boosting effect.
A common problem with time series forecasting models is the low accuracy of long-term forecasts: the estimated value of a variable may be reasonably reliable in the short term, but for longer horizons the estimate is likely to become less accurate.
Yet while reliable MS time series prediction has many important applications and is often the intended outcome, the published literature usually considers SS prediction. The main reasons are the increased difficulty of problems requiring MS prediction and the fact that the results obtained by simple extensions of techniques developed for SS prediction are often disappointing. Moreover, while many different techniques perform rather similarly on SS prediction problems, significant differences show up when extensions of these techniques are applied to MS problems.
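The simplest such extension is recursive (iterated) prediction, where an SS model is applied h times and each prediction is fed back as an input, so that errors can accumulate with the horizon. The sketch below, reusing the windowed MLP from the earlier example, illustrates the idea; it is a generic illustration, not the method proposed in this paper.

```python
def recursive_forecast(model, history, h):
    """Iterate a one-step model h steps ahead, feeding predictions back in."""
    window = list(history[-p:])            # last p observed values
    preds = []
    for _ in range(h):
        nxt = model.predict(np.array(window).reshape(1, -1))[0]
        preds.append(nxt)
        window = window[1:] + [nxt]        # slide: drop oldest, append estimate
    return preds

print(recursive_forecast(model, series[:-200], h=10))
```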
In this paper we first briefly review the modelling approaches to MS prediction, then present existing work on the use of neural networks for MS prediction. In Section 3, ensemble methods are reviewed before the generic boosting algorithm is presented, and in Section 4 related work on boosting for regression is presented.
Next, a definition of RNNs and the associated BPTT learning algorithm are provided. The new boosting algorithm is described in Section 6. Finally, in Section 7, we report the results of experiments on three benchmark SS prediction problems and two benchmark MS prediction problems, all of which show an overall improvement in performance.
Modelling approaches for time series
The most common approach to dealing with a prediction problem can be traced back to [7] and consists in using a fixed number M of past values (a fixed-length time window sliding over the time series) when building the prediction:

x̂(t + 1) = f(x(t), x(t − 1), …, x(t − M + 1)).    (1)

For example, with M = 3 the estimate of x(t + 1) is computed from the three most recent values x(t), x(t − 1) and x(t − 2). Most of the current work on SS prediction relies on a result in [8] showing that, under several assumptions (among them the absence of noise), it is possible to obtain a perfect estimate of x(t + τ) according to Eqs. (1) and (2)
Combining multiple learning methods
The combination of models (classifiers or regressors) is an effective way to improve model performance [31]. The goal of combining models is to obtain a more precise estimate than that given by a single model, and several effective methods for improving the performance of a simple algorithm by combining several models have been put forward.
These combination methods can be classified into three different groups. The first comprises the voting methods, which include the bagging algorithm [32] and the boosting algorithms; a sketch of the voting idea follows below.
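As a concrete illustration of the voting family, the following sketch bags several small MLPs on the windowed data from the earlier example and averages their outputs. A recent scikit-learn is assumed (for the BaggingRegressor estimator keyword); this is not the combination scheme proposed in this paper.

```python
from sklearn.ensemble import BaggingRegressor

# Bagging: each base MLP is trained on a bootstrap resample of the training
# set, and the ensemble prediction is the average of the individual outputs.
bag = BaggingRegressor(
    estimator=MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000),
    n_estimators=10,
    random_state=0,
)
bag.fit(X[:-200], y[:-200])
print("bagged test MSE:", np.mean((bag.predict(X[-200:]) - y[-200:]) ** 2))
```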
Boosting methods
Boosting methods belong to the family of model aggregation methods (see for example [38], [39]). They make it possible to go beyond the initial constraints tied to the selection of a single model, and they apply to base models whose results are at least slightly better than random guessing. The goal is to obtain an aggregate model whose error is smaller than the errors obtained by the individual models. The basic idea is to increase diversity so as to maximize the coverage of the data space.
Recurrent neural networks
RNNs are characterized by the presence of cycles in the graph of interconnections and are able to model temporal dependencies of unspecified duration between the inputs and the associated desired outputs, by using internal memory. Unlike in an MLP, the passage of information from one neuron to another through a connection is not instantaneous (it takes one time step), and the presence of loops thus makes it possible to retain the influence of the information for a variable, theoretically unlimited, time period.
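The following minimal sketch shows a recurrent network trained by BPTT, assuming PyTorch; gradients propagated through the unrolled sequence implement BPTT automatically. The architecture and hyperparameters are illustrative, not those of the networks used in the paper.

```python
import torch
import torch.nn as nn

class SimpleRNN(nn.Module):
    """One recurrent layer followed by a linear readout of the last hidden state."""
    def __init__(self, hidden_size=8):
        super().__init__()
        self.rnn = nn.RNN(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, 1)

    def forward(self, x):                 # x: (batch, seq_len, 1)
        h, _ = self.rnn(x)
        return self.out(h[:, -1, :])      # predict from the final time step

# Reuse the windowed data from the earlier sketch as input sequences.
Xt = torch.tensor(X, dtype=torch.float32).unsqueeze(-1)   # (N, p, 1)
yt = torch.tensor(y, dtype=torch.float32).unsqueeze(-1)   # (N, 1)

net = SimpleRNN()
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()
for epoch in range(200):
    opt.zero_grad()
    loss = loss_fn(net(Xt[:-200]), yt[:-200])
    loss.backward()                       # autograd unrolls through time: BPTT
    opt.step()
```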
Boosting recurrent neural networks
Boosting is a general method for improving the accuracy of any given learning algorithm. It produces a final solution by combining the rough and moderately inaccurate decisions offered by different classifiers, each of which need only be slightly better than random guessing. In boosting, the training set used for each classifier is produced (weighted) based on the performance of the earlier classifier(s) in the series. Therefore, samples incorrectly classified by previous classifiers in the series are given more weight when training subsequent classifiers.
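The sketch below conveys the flavour of such a scheme for regression: weak learners are trained on weighted resamples of the data, example weights grow with prediction error, and a parameter k tempers how aggressively the distribution concentrates on hard examples. This is an AdaBoost.R2-style illustration with shallow trees as stand-ins for RNN weak learners, and k here is a hypothetical knob; it is not the exact algorithm proposed in this paper.

```python
from sklearn.tree import DecisionTreeRegressor

def boost_regressors(X, y, n_rounds=10, k=0.5, seed=0):
    """Illustrative boosting loop; k in (0, 1] tempers re-weighting (hypothetical)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    w = np.full(n, 1.0 / n)               # start from a uniform distribution
    models = []
    for _ in range(n_rounds):
        idx = rng.choice(n, size=n, p=w)  # weighted resampling of the training set
        m = DecisionTreeRegressor(max_depth=3).fit(X[idx], y[idx])
        err = np.abs(m.predict(X) - y)
        err /= err.max() + 1e-12          # normalise losses to [0, 1]
        w *= np.exp(k * err)              # hard examples gain weight, scaled by k
        w /= w.sum()
        models.append(m)
    return models

ensemble = boost_regressors(X[:-200], y[:-200])
pred = np.mean([m.predict(X[-200:]) for m in ensemble], axis=0)  # simple average
print("boosted test MSE:", np.mean((pred - y[-200:]) ** 2))
```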
Experimental results
The boosting algorithm described was used with the BPTT learning algorithm on SS and MS time series forecasting problems. We add several new results to a previous study [53] on SS problems and also present a new study on MS problems.
The first set of experiments was carried out in order to explore the performance of the constructive algorithm and to study the influence of the parameter k on its behaviour. The algorithm is applied to the sunspots time series and two Mackey-Glass time series (MG17 and MG30).
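For reference, the Mackey-Glass benchmarks are generated from the delay differential equation dx/dt = 0.2 x(t − τ) / (1 + x(t − τ)^10) − 0.1 x(t), with τ = 17 for MG17 and τ = 30 for MG30. A crude Euler-integration sketch follows; the step size and constant initial history are illustrative choices.

```python
def mackey_glass(n, tau=17, dt=1.0, x0=1.2):
    """Euler integration of the Mackey-Glass delay differential equation."""
    hist = int(tau / dt)                  # number of steps covering the delay
    x = np.full(n + hist, x0)             # constant initial history (illustrative)
    for t in range(hist, n + hist - 1):
        x_tau = x[t - hist]
        x[t + 1] = x[t] + dt * (0.2 * x_tau / (1 + x_tau ** 10) - 0.1 * x[t])
    return x[hist:]

mg17 = mackey_glass(2000, tau=17)
```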
Conclusion and future work
We adapted boosting to the problem of learning time-dependencies in sequential data for predicting future values, adding a new parameter for tuning the boosting influence and using recurrent neural networks as “weak” regressors.
The experimental results we have obtained show that the boosting algorithm indeed improves performance compared with the use of a single RNN.
We first compared our results on SS prediction with those obtained by other combination methods.
References
- et al., A decision-theoretic generalization of on-line learning and an application to boosting, Journal of Computer and System Sciences (1997)
- et al., Learning long-term dependencies by the selective addition of time-delayed connections to recurrent neural networks, Neurocomputing (2002)
- et al., State space reconstruction in the presence of noise, Physica D (1991)
- et al., Multi-step-ahead prediction using dynamic recurrent neural networks, Neural Networks (2000)
- Stacked generalization, Neural Networks (1992)
- et al., Recurrent neural networks can be trained to be maximum a posteriori probability classifiers, Neural Networks (1995)
- Dynamical recurrent neural networks: towards prediction and modelling of dynamical systems, Neurocomputing (1999)
- Nonlinear prediction of chaotic time series, Physica D (1989)
- et al., Time-series prediction using a local linear wavelet neural network, Neurocomputing (2006)
- Boosting Using Neural Nets
- Learning Internal Representations by Error Propagation
- A learning algorithm for continually running fully recurrent neural networks, Neural Computation
- The strength of weak learnability, Machine Learning
- On a method of investigating periodicity in disturbed series with special reference to Wolfer’s sunspot numbers, Philosophical Transactions of the Royal Society of London Series A
- Detecting Strange Attractors in Turbulence
- Sufficient conditions for error backflow convergence in dynamical recurrent neural networks, Neural Computation
- Recurrent neural networks with small weights implement definite memory machines, Neural Computation
- Prediction of chaotic time-series using dynamic cell structures and local linear models, Neural Network World
- Neural-gas network for vector quantization and its application to time-series prediction, IEEE Transactions on Neural Networks
- Stabilization Properties of Multilayer Feedforward Networks with Time-Delays Synapses
- Time Series Prediction by Using a Connectionist Network with Internal Delay Lines
- Short term electrical load forecasting with artificial neural networks, Engineering Intelligent Systems
- Learning long-term dependencies in NARX recurrent neural networks, IEEE Transactions on Neural Networks
- Hierarchical Recurrent Neural Networks for Long-Term Dependencies
- A comparison between neural network forecasting techniques – case study: river flow forecasting, IEEE Transactions on Neural Networks
- Learning a simple recurrent neural state space model to behave like Chua’s double scroll, IEEE Transactions on Circuits and Systems-I