Time-series forecasting through wavelets transformation and a mixture of expert models
Introduction
Many potential applications of predictive models involve forecasting discrete time series. A discrete time series is a finite or infinite enumerable set of observations of a given variable z(t), ordered according to the time parameter and denoted as z(1), z(2), …, z(N), where N is the size of the time series. A key assumption frequently adopted in time-series forecasting is that the statistical properties of the data generator are time independent. The goal is, given a part of the time series named a pattern, (z(t−w+1), …, z(t)), where w is the length of the window on the time series, to predict a future value z(t+h), where h ≥ 1 is named the prediction horizon. We focus on models that are able to exploit the regularities of the patterns observed in the past to accurately predict the short-term evolution of the system (small h).
Earlier research efforts on time-series forecasting focused on linear models, typified by ARMA models. Two crucial developments appeared around 1980 [4]:
1. the state-space reconstruction paradigm by time-delay embedding, which drew on ideas from differential topology and dynamical systems to provide a technique for recognizing when a time series has been generated by deterministic governing equations and, if so, for understanding the geometrical structure underlying the observed behaviour;
2. the emergence of the field of machine learning, typified by neural networks, which can adaptively explore a large space of potential models [1], [9].
In the second case, the idea is to approximate the underlying function by using an adaptive model, such as a connectionist network, trained to emulate the input–output behaviour. By moving a window along the discrete time series, we can create a training data set consisting of many sets of input values (patterns) with the corresponding target values. Once the adaptive model has been trained, it can be presented with a pattern of observed values and used to make a prediction for z(t+h).
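The windowing procedure described above can be sketched as follows; the function name and the plain-list representation are illustrative, not taken from the paper:

```python
def make_patterns(z, w, h):
    """Slide a window of length w along the series z: each pattern of w
    consecutive observations is paired with the target value h steps ahead."""
    patterns, targets = [], []
    for t in range(w - 1, len(z) - h):
        patterns.append(z[t - w + 1 : t + 1])  # pattern (z(t-w+1), ..., z(t))
        targets.append(z[t + h])               # target z(t+h)
    return patterns, targets
```

For example, with `z = [0, 1, ..., 9]`, `w = 3`, and `h = 1`, the first training pair is the pattern `[0, 1, 2]` with target `3`.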
One lesson from the research already developed on time-series forecasting is that no single method is universally superior to the others. In this paper, we describe a method based on a mixture of expert models (MEM) for time-series forecasting. We consider the problem of learning a mapping in which the form of the mapping is different for different regions of the input space. The idea is to construct a specific predictive model for each input-space region. To improve the quality of the information available to the models, we first perform a wavelet transformation of the time-series data.
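The wavelet preprocessing step can be illustrated with the Haar transform, the variant the paper names in its concluding remarks. This minimal sketch uses the common unnormalised average/difference convention and assumes the pattern length is a power of two; the function name is illustrative:

```python
def haar_transform(x):
    """Full Haar wavelet decomposition of a signal whose length is a power
    of two: split the signal into pairwise averages (trend) and pairwise
    differences (detail), then recurse on the trend half in place."""
    out = list(x)
    n = len(out)
    while n > 1:
        avg = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(n // 2)]
        det = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(n // 2)]
        out[: n // 2] = avg        # coarser trend coefficients
        out[n // 2 : n] = det      # detail coefficients at this level
        n //= 2
    return out
```

The transformed pattern, rather than the raw window of observations, is then fed to the expert models.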
Next, we describe how the paper is organized. Section 2 describes the MEM method. An extensive description of partial least squares (PLS) is included because it has been intensively used in our experiments. Section 3 contains experimental results on two time series distributed by the Santa Fe Institute [11]. Concluding remarks are given in Section 4.
Section snippets
Methodology
In this section we present MEM, a method for time-series forecasting based on a mixture of expert models.
MEM focuses on the problem of learning a mapping in which the form of the mapping is different for different regions of the input space. Although a single homogeneous adaptive model could be applied to this problem, we might expect that the task would be better performed if we assign different “expert” models to tackle each of the different regions, and then use an extra “gating” model,
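The snippet above breaks off before detailing the gate. A minimal hard-gating sketch, assuming each input-space region is represented by a centroid and each region has its own trained expert (the centroid representation and function names are illustrative, not from the paper):

```python
def nearest_centroid(pattern, centroids):
    """Hard gating: route a pattern to the region with the closest centroid."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(range(len(centroids)), key=lambda k: dist2(pattern, centroids[k]))

def mem_predict(pattern, centroids, experts):
    """Predict with the expert model assigned to the pattern's region."""
    region = nearest_centroid(pattern, centroids)
    return experts[region](pattern)
```

With soft gating, the gate would instead output a weight per expert and the prediction would be the weighted combination; the hard variant above matches a disjoint partition of the input space.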
Experimental results
We have tested the MEM method on two time series taken from the Santa Fe Time Series Prediction and Analysis Competition, held during the fall of 1990 under the auspices of the Santa Fe Institute [4], [11].
Concluding remarks
We have presented a system formed by a mixture of expert models (MEM) for time-series forecasting. Since no single predictive method is universally superior to the others, we have expanded previous implementations of MEM by allowing different types of predictive models to participate in its constitution. After performing a change of basis with the Haar wavelet transform, the input space was partitioned into disjoint regions by a clustering algorithm, and for each region a benchmark was
References (11)
- et al., Partial least squares regression: a tutorial, Anal. Chim. Acta (1986)
- C.M. Bishop, Neural Networks for Pattern Recognition, Clarendon Press, Oxford, ...
- P.A. Devijver, J. Kittler, Pattern Recognition: A Statistical Approach, Prentice-Hall, Englewood Cliffs, NJ, ...
- R. Gnanadesikan, Methods for Statistical Data Analysis of Multivariate Observations, Wiley, New York, ...
- A.S. Weigend, N.A. Gershenfeld, Forecasting the Future and Understanding the Past, Addison-Wesley, Reading, MA, ...
Ruy Luiz Milidiú received the Ph.D. degree in operations research from the University of California, Berkeley. He is currently an Assistant Professor in the Informatics Department at the Pontifícia Universidade Católica do Rio de Janeiro, Brazil, where he also coordinates the Algorithms Engineering and Neural Networks Lab. His research activity is in data compression, neural networks and systems optimization.
Ricardo Machado received a D.Sc. in Computer Science from the Federal University of Rio de Janeiro, Brazil, in 1985. Until his untimely passing away in late 1997 he was with the Algorithms Engineering and Neural Networks Lab at the Catholic University of Rio de Janeiro, Brazil. Prior to that he was with the IBM Rio Scientific Center, where he conducted research on neural networks applications.
Raúl Rentería graduated in computer engineering in 1996 from the Pontifícia Universidade Católica do Rio de Janeiro, Brazil. He is currently pursuing the M.S. degree in the Informatics Department there, where he is also a research assistant at the Algorithms Engineering and Neural Networks Lab.
1 Deceased.