Abstract
Gaussian Processes (GPs), equipped with a sufficiently expressive additive kernel, provide results competitive with state-of-the-art time series forecasting approaches (ARIMA, ETS), provided that: (i) during training, the unnecessary components of the kernel are made irrelevant by automatic relevance determination; (ii) priors are assigned to each hyperparameter. However, the computational complexity of GPs grows cubically in time and quadratically in memory with the number of observations. The state space (SS) approximation of GPs allows GP-based inference to be computed with linear complexity. In this paper, we apply the SS representation to time series forecasting, showing that SS models achieve performance comparable to that of the full GP and better than state-of-the-art models (ARIMA, ETS). Moreover, the SS representation allows us to derive new models by, for instance, combining ETS with kernels.
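To illustrate the linear-complexity idea, here is a minimal sketch under our own assumptions (not the paper's code): the Matérn-1/2 (Ornstein-Uhlenbeck) kernel \(k(t_1,t_2)=\sigma^2 e^{-|t_1-t_2|/\ell}\) admits an exact one-dimensional state space form, so the GP posterior over n observations can be computed by a Kalman filter in O(n) time and memory rather than O(n^3):

```python
import numpy as np

# Illustrative sketch: Kalman filtering for a GP with a Matern-1/2
# (Ornstein-Uhlenbeck) kernel, which has an exact 1-D state space form.
# All parameter names (ell, s2, noise) are our own choices for the sketch.
def kalman_ou(t, y, ell=1.0, s2=1.0, noise=0.1):
    m, P = 0.0, s2                # stationary prior on the latent state
    means, variances = [], []
    prev = t[0]
    for ti, yi in zip(t, y):
        a = np.exp(-(ti - prev) / ell)              # discrete transition
        m, P = a * m, a * a * P + s2 * (1 - a * a)  # predict step
        S = P + noise                               # innovation variance
        k = P / S                                   # Kalman gain
        m, P = m + k * (yi - m), (1 - k) * P        # update step
        means.append(m)
        variances.append(P)
        prev = ti
    return np.array(means), np.array(variances)

t = np.linspace(0.0, 10.0, 200)
y = np.sin(t) + 0.1 * np.random.default_rng(0).normal(size=t.size)
mu, var = kalman_ou(t, y)
```

Each observation is processed once with constant-size state, which is the source of the linear scaling; richer kernels (Matérn-3/2, quasi-periodic, sums of kernels) follow the same recursion with a higher-dimensional state.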
Notes
- 1.
A GP prior with zero mean function and covariance function \(k_{\boldsymbol{\theta }}:\mathbb {R}^p\times \mathbb {R}^p \rightarrow \mathbb {R}^+\), which depends on a vector of hyperparameters \(\boldsymbol{\theta }\).
- 2.
In this work, we incorporate the additive noise v into the kernel by adding a white noise kernel term.
- 3.
A stationary kernel is one which is translation invariant: \( k_{\boldsymbol{\theta }}(x_1, x_2)\) depends only on \(x_1-x_2\), as for instance the Matérn and RBF kernels.
- 4.
m is a latent dimension which defines the dimension of the state space. The state is a function of time.
- 5.
The matrix exponential is \(e^A=I+A+A^2/2!+A^3/3!+\dots \) and, for many matrices A, it can be computed analytically.
- 6.
We also tried a more accurate approximation of the periodic kernel (11 COS kernels), but it did not provide a significantly better performance in the M3 competition.
- 7.
In both cases, we estimated the kernels' hyperparameters using MAP.
- 8.
- 9.
In contrast to ARIMA and ETS, GP and SS models can easily model non-integer seasonalities, like those in the Electricity dataset; see [3] for more details.
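The translation invariance in note 3 can be checked in a few lines; this is a hedged sketch using the standard RBF kernel form, not code from the paper:

```python
import numpy as np

# RBF kernel: k(x1, x2) = s2 * exp(-0.5 * (x1 - x2)^2 / ell^2),
# a function of x1 - x2 only, hence stationary.
def rbf(x1, x2, ell=1.0, s2=1.0):
    return s2 * np.exp(-0.5 * ((x1 - x2) / ell) ** 2)

# Translation invariance: shifting both inputs leaves the value unchanged.
shift = 3.7
print(np.isclose(rbf(1.0, 2.5), rbf(1.0 + shift, 2.5 + shift)))  # True
```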
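The truncated power series in note 5 can also be verified numerically; the sketch below compares it against a case where \(e^A\) is known analytically (a skew-symmetric generator, whose exponential is a rotation):

```python
import numpy as np

# Truncated power series e^A = I + A + A^2/2! + A^3/3! + ...
def expm_series(A, terms=25):
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k   # accumulates A^k / k!
        out = out + term
    return out

# Skew-symmetric generator: e^A is the rotation by one radian,
# one of the cases where the matrix exponential is known in closed form.
A = np.array([[0.0, -1.0], [1.0, 0.0]])
exact = np.array([[np.cos(1.0), -np.sin(1.0)],
                  [np.sin(1.0),  np.cos(1.0)]])
print(np.allclose(expm_series(A), exact))  # True
```

In practice one would use a library routine (e.g. a Padé-based matrix exponential) rather than the raw series, which is shown here only to match the formula in the note.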
References
Bauer, M., van der Wilk, M., Rasmussen, C.E.: Understanding probabilistic sparse Gaussian process approximations. In: Advances in Neural Information Processing Systems, pp. 1533–1541 (2016)
Benavoli, A., Zaffalon, M.: State Space representation of non-stationary Gaussian processes. arXiv preprint arXiv:1601.01544 (2016)
Corani, G., Benavoli, A., Zaffalon, M.: Time series forecasting with Gaussian Processes needs priors. In: Proceedings of the ECML PKDD (2021, accepted). https://arxiv.org/abs/2009.08102
Foreman-Mackey, D., Agol, E., Ambikasaran, S., Angus, R.: Fast and scalable Gaussian process modeling with applications to astronomical time series. Astron. J. 154(6), 220 (2017)
Gneiting, T., Raftery, A.E.: Strictly proper scoring rules, prediction, and estimation. J. Am. Stat. Assoc. 102(477), 359–378 (2007)
Hensman, J., Fusi, N., Lawrence, N.D.: Gaussian processes for big data. In: Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence, UAI 2013, pp. 282–290. AUAI Press, Arlington (2013)
Hernández-Lobato, D., Hernández-Lobato, J.M.: Scalable Gaussian process classification via expectation propagation. In: Artificial Intelligence and Statistics, pp. 168–176 (2016)
Hyndman, R.J., Athanasopoulos, G.: Forecasting: Principles and Practice, 2nd edn. OTexts, Melbourne (2018). OTexts.com/fpp2
Hyndman, R.J., Khandakar, Y.: Automatic time series forecasting: the forecast package for R. J. Stat. Softw. 27(3), 1–22 (2008). http://www.jstatsoft.org/article/view/v027i03
Jazwinski, A.H.: Stochastic Processes and Filtering Theory. Courier Corporation, New York (2007)
Karvonen, T., Särkkä, S.: Approximate state-space Gaussian processes via spectral transformation. In: 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6. IEEE (2016)
Lloyd, J.R.: GEFCom2012 hierarchical load forecasting: gradient boosting machines and Gaussian processes. Int. J. Forecast. 30(2), 369–374 (2014)
Loper, J., Blei, D., Cunningham, J.P., Paninski, L.: General linear-time inference for Gaussian processes on one dimension. arXiv preprint arXiv:2003.05554 (2020)
Quiñonero-Candela, J., Rasmussen, C.E.: A unifying view of sparse approximate Gaussian process regression. J. Mach. Learn. Res. 6, 1939–1959 (2005)
Rasmussen, C., Williams, C.: Gaussian Processes for Machine Learning. The MIT Press, Cambridge (2006)
Roberts, S., Osborne, M., Ebden, M., Reece, S., Gibson, N., Aigrain, S.: Gaussian processes for time-series modelling. Philos. Trans. Royal Soc. A Math. Phys. Eng. Sci. 371(1984), 20110550 (2013)
Särkkä, S., Hartikainen, J.: Infinite-dimensional Kalman filtering approach to spatio-temporal Gaussian process regression. In: International Conference on Artificial Intelligence and Statistics, pp. 993–1001 (2012)
Särkkä, S., Solin, A., Hartikainen, J.: Spatiotemporal learning via infinite-dimensional Bayesian filtering and smoothing: a look at Gaussian process regression through Kalman filtering. IEEE Signal Process. Mag. 30(4), 51–61 (2013)
Schuerch, M., Azzimonti, D., Benavoli, A., Zaffalon, M.: Recursive estimation for sparse Gaussian process regression. Automatica 120, 109–127 (2020)
Snelson, E., Ghahramani, Z.: Sparse Gaussian processes using pseudo-inputs. In: Advances in Neural Information Processing Systems, pp. 1257–1264 (2006)
Solin, A., Särkkä, S.: Explicit link between periodic covariance functions and state space models. In: Artificial Intelligence and Statistics, pp. 904–912. PMLR (2014)
Solin, A., Särkkä, S.: Gaussian quadratures for state space approximation of scale mixtures of squared exponential covariance functions. In: 2014 IEEE International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6. IEEE (2014)
Taylor, S.J., Letham, B.: Forecasting at scale. Am. Stat. 72(1), 37–45 (2018)
Titsias, M.: Variational learning of inducing variables in sparse Gaussian processes. In: van Dyk, D., Welling, M. (eds.) Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research, PMLR, Hilton Clearwater Beach Resort, Clearwater Beach, Florida USA, 16–18 April 2009, vol. 5, pp. 567–574 (2009)
Wilson, A., Adams, R.: Gaussian process kernels for pattern discovery and extrapolation. In: International Conference on Machine Learning, pp. 1067–1075. PMLR (2013)
Wood, S.N.: Generalized Additive Models: An Introduction with R. CRC Press, Boca Raton (2017)
Acknowledgements
The authors acknowledge support from the Swiss National Research Programme 75 “Big Data” Grant No. 407540_167199/1.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Benavoli, A., Corani, G. (2021). State Space Approximation of Gaussian Processes for Time Series Forecasting. In: Lemaire, V., Malinowski, S., Bagnall, A., Guyet, T., Tavenard, R., Ifrim, G. (eds) Advanced Analytics and Learning on Temporal Data. AALTD 2021. Lecture Notes in Computer Science, vol 13114. Springer, Cham. https://doi.org/10.1007/978-3-030-91445-5_2
DOI: https://doi.org/10.1007/978-3-030-91445-5_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-91444-8
Online ISBN: 978-3-030-91445-5