Simulation smoothing for state–space models: A computational efficiency analysis

https://doi.org/10.1016/j.csda.2010.07.009

Abstract

Simulation smoothing involves drawing state variables (or innovations) in discrete time state–space models from their conditional distribution given parameters and observations. Gaussian simulation smoothing is of particular interest, not only for the direct analysis of Gaussian linear models, but also for the indirect analysis of more general models. Several methods for Gaussian simulation smoothing exist, most of which are based on the Kalman filter. Since states in Gaussian linear state–space models are Gaussian Markov random fields, it is also possible to apply the Cholesky Factor Algorithm (CFA) to draw states. This algorithm takes advantage of the band diagonal structure of the Hessian matrix of the log density to make efficient draws. We show how to exploit the special structure of state–space models to draw latent states even more efficiently. We analyse the computational efficiency of Kalman-filter-based methods, the CFA, and our new method using counts of operations and computational experiments. We show that for many important cases, our method is most efficient. Gains are particularly large for cases where the dimension of observed variables is large or where one makes repeated draws of states for the same parameter values. We apply our method to a multivariate Poisson model with time-varying intensities, which we use to analyse financial market transaction count data.

Introduction

State–space models are time series models featuring both latent and observed variables. The latent variables have different interpretations according to the application. They may be the unobserved states of a system in biology, economics or engineering. They may be time-varying parameters of a model. They may be factors in dynamic factor models, capturing covariances among a large set of observed variables in a parsimonious way.

Gaussian linear state–space models are interesting in their own right, but they are also useful devices for the analysis of more general state–space models. In some cases, the model becomes a Gaussian linear state–space model, or a close approximation, once we condition on certain variables. These variables may be a natural part of the model, as in Carter and Kohn (1996), or they may be convenient but artificial devices, as in Kim et al. (1998), Stroud et al. (2003) and Frühwirth-Schnatter and Wagner (2006).

In other cases, one can approximate the conditional distribution of states in a non-Gaussian or non-linear model by its counterpart in a Gaussian linear model. If the approximation is close enough, one can use the latter for importance sampling, as Durbin and Koopman (1997) do to compute likelihood functions, or as a proposal distribution in a Metropolis–Hastings update, as Shephard and Pitt (1997) do for posterior Markov chain Monte Carlo simulation.

To fix notation, consider the following Gaussian linear state–space model, expressed using notation from de Jong and Shephard (1995):

yt = Xtβ + Ztαt + Gtut,  t = 1, …, n,  (1)

αt+1 = Wtβ + Ttαt + Htut,  t = 1, …, n − 1,  (2)

α1 ~ N(a1, P1),  ut ~ i.i.d. N(0, Iq),

where yt is a p × 1 vector of dependent variables, αt is an m × 1 vector of state variables, and β is a k × 1 vector of coefficients. The matrices Xt, Zt, Gt, Wt, Tt and Ht are known. Eq. (1) is the measurement equation and Eq. (2) is the state equation. Let y ≡ (y1, …, yn) and α ≡ (α1, …, αn).
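To make the notation concrete, the model in Eqs. (1)–(2) can be simulated directly. The sketch below uses small, time-invariant system matrices chosen purely for illustration (the dimensions and values are assumptions, not taken from the paper); note that the same innovation vector ut drives both the measurement and state equations, with Gt and Ht controlling how it enters each.

```python
import numpy as np

# Illustrative dimensions: p = 2 observables, m = 1 state, k = 1 coefficient,
# q = 3 innovations. All system matrices are time-invariant for simplicity.
rng = np.random.default_rng(0)
n, p, m, k, q = 50, 2, 1, 1, 3

X = np.ones((p, k))
Z = np.array([[1.0], [0.5]])
G = np.hstack([0.3 * np.eye(p), np.zeros((p, 1))])   # p x q
W = np.zeros((m, k))
T = np.array([[0.9]])
H = np.hstack([np.zeros((m, p)), np.eye(m)])         # m x q; G @ H.T = 0 here
beta = np.array([0.1])
a1, P1 = np.zeros(m), np.eye(m)

alpha = np.zeros((n, m))
y = np.zeros((n, p))
alpha[0] = rng.multivariate_normal(a1, P1)           # alpha_1 ~ N(a1, P1)
for t in range(n):
    u = rng.standard_normal(q)                       # u_t ~ N(0, I_q)
    y[t] = X @ beta + Z @ alpha[t] + G @ u           # measurement equation (1)
    if t < n - 1:
        alpha[t + 1] = W @ beta + T @ alpha[t] + H @ u  # state equation (2)
```

With this choice of G and H the measurement and state disturbances are uncorrelated; correlated disturbances would correspond to G @ H.T being non-zero.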

We will consider the familiar and important question of simulation smoothing, which is drawing α as a block from its conditional distribution given y. This is an important component of various sampling methods for learning about the posterior distribution of states, parameters and other functions of interest.

Several authors have proposed ways of drawing states in Gaussian linear state–space models using the Kalman filter, including Carter and Kohn (1994), Frühwirth-Schnatter (1994), de Jong and Shephard (1995), and Durbin and Koopman (2002).

Rue (2001) introduces the Cholesky Factor Algorithm (CFA), an efficient way to draw Gaussian Markov Random Fields (GMRFs) based on the Cholesky decomposition of the precision (inverse of variance) of the random field. He also recognizes that the conditional distribution of α given y in Gaussian linear state–space models is a special case of a GMRF. Knorr-Held and Rue (2002) comment on the relationship between the CFA and methods based on the Kalman filter.

Chan and Jeliazkov (2009) describe two empirical applications of the CFA for Bayesian inference in state–space macroeconomic models. One is a time-varying parameter vector autoregression model for output growth, unemployment, income and inflation. The other is a dynamic factor model for US post-war macroeconomic data.

The Kalman filter is used not only for simulation smoothing, but also to evaluate the likelihood function for Gaussian linear state–space models. We can do the same using the CFA and our method. Both give evaluations of f(α|y) for arbitrary α with little additional computation. We can then evaluate the likelihood as f(y) = f(α) f(y|α) / f(α|y) for any value of α. A convenient choice is the conditional mean of α given y, since it is easy to obtain and simplifies the computation of f(α|y).
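The identity f(y) = f(α) f(y|α) / f(α|y) holds for any α, so it can be checked numerically on a toy problem. The sketch below (an assumption-laden one-dimensional example, not the paper's model) uses a scalar state α ~ N(0, 1) and observation y | α ~ N(α, 0.5), for which all four densities are available in closed form.

```python
import numpy as np
from scipy.stats import norm

# Toy check of f(y) = f(alpha) * f(y|alpha) / f(alpha|y)
# for alpha ~ N(0, 1), y | alpha ~ N(alpha, 0.5).
alpha0, y0 = 0.7, -0.2

log_f_alpha = norm.logpdf(alpha0, 0.0, 1.0)
log_f_y_given_alpha = norm.logpdf(y0, alpha0, np.sqrt(0.5))
# Posterior: alpha | y ~ N(y / 1.5, 1/3)  (precision 1 + 1/0.5 = 3)
log_f_alpha_given_y = norm.logpdf(alpha0, y0 / 1.5, np.sqrt(1.0 / 3.0))

# The identity, in logs; the result must not depend on the chosen alpha0.
log_f_y = log_f_alpha + log_f_y_given_alpha - log_f_alpha_given_y
```

The marginal here is y ~ N(0, 1.5), and the computed log_f_y matches its log density at y0 regardless of which α0 is plugged in.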

The Kalman filter also delivers intermediate quantities that are useful for computing filtering distributions, the conditional distributions of α1, …, αt given y1, …, yt, for various values of t. While it is difficult to use the CFA algorithm to compute these distributions efficiently, it is fairly straightforward to do so using our method.

We make four main contributions in this paper. The first is a new method, outlined in Section 2, for drawing states in state–space models. Like the CFA, it uses the precision and co-vector (precision times mean) of the conditional distribution of α given y and does not use the Kalman filter. Unlike the CFA, it generates the conditional means E[αt | αt+1, …, αn, y] and conditional variances Var[αt | αt+1, …, αn, y] as a byproduct. These conditional moments turn out to be useful in an extension of the method, described in McCausland (2008), to non-Gaussian and non-linear state–space models with univariate states, because they facilitate Gaussian and other approximations to the conditional distribution of αt given αt+1 and y. With little additional computation, one can also compute the conditional means E[αt | y1, …, yt] and variances Var[αt | y1, …, yt], which together specify the filtering distributions, useful for sequential learning.

The second main contribution, described in Section 3, is a careful analysis of the computational efficiency of various methods for drawing states, showing that the CFA and our new method are much more computationally efficient than methods based on the Kalman filter when p is large or when repeated draws of α are required. For the important special case of state–space models, our new method is up to twice as fast as the CFA for large m. We find examples of applications with large p in recent work in macroeconomics and forecasting using “data-rich” environments, where a large number of observed time series is linked to a much smaller number of latent factors. See, for example, Boivin and Giannoni (2006), which estimates Dynamic Stochastic General Equilibrium (DSGE) models, or Stock and Watson (1999, 2002) and Forni et al. (2000), which show that factor models with large numbers of variables give better forecasts than small-scale vector autoregressive (VAR) models do. Examples with large numbers of repeated draws of α include the evaluation of the likelihood function in non-linear or non-Gaussian state–space models using importance sampling, as in Durbin and Koopman (1997).

Our third contribution is to illustrate these simulation smoothing methods using an empirical application. In Section 4, we use them to approximate the likelihood function for a multivariate Poisson state–space model, using importance sampling. Latent states govern time-varying intensities. Observed data are transaction counts in financial markets.

The final contribution is the explicit derivation, in Appendix A, of the precision and co-vector of the conditional distribution of α given y in Gaussian linear state–space models. These two objects are the inputs to the CFA and our new method.

We conclude in Section 5.


Precision-based methods for simulation smoothing

In this section we discuss two methods for state smoothing using the precision Ω and co-vector c of the conditional distribution of α given y. The first method is due to Rue (2001), who considers the more general problem of drawing Gaussian Markov random fields. The second method, introduced here, offers new insights and more efficient draws for the special case of Gaussian linear state–space models. Both methods involve pre-computation, which one performs once for a given Ω and c, and
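Both methods exploit the fact that the conditional precision Ω of α given y is block tridiagonal. The sketch below illustrates the general idea of a precision-based sampler via a block forward elimination followed by a backward sampling pass; it is written under that block-tridiagonal assumption and is not a reproduction of the paper's exact algorithm, whose details may differ.

```python
import numpy as np

def draw_states(Omega_diag, Omega_off, c, rng):
    """Draw alpha ~ N(Omega^{-1} c, Omega^{-1}) for a block-tridiagonal
    precision Omega.  Omega_diag[t] is the m x m diagonal block Omega_{t,t},
    Omega_off[t] is the off-diagonal block Omega_{t,t+1}, and c[t] is the
    t-th block of the co-vector.  Illustrative sketch only."""
    n, m = c.shape
    Sigma = np.empty((n, m, m))   # Var[alpha_t | alpha_{t+1}, ..., alpha_n, y]
    mu = np.empty((n, m))         # intermediate means from the forward pass
    # Forward pass: eliminate alpha_1, ..., alpha_{n-1} in turn
    # (block LDL'-style Schur complements of the tridiagonal precision).
    Sigma[0] = np.linalg.inv(Omega_diag[0])
    mu[0] = Sigma[0] @ c[0]
    for t in range(1, n):
        B = Omega_off[t - 1]      # Omega_{t-1,t}
        Sigma[t] = np.linalg.inv(Omega_diag[t] - B.T @ Sigma[t - 1] @ B)
        mu[t] = Sigma[t] @ (c[t] - B.T @ mu[t - 1])
    # Backward pass: alpha_n ~ N(mu_n, Sigma_n), then
    # alpha_t | alpha_{t+1} ~ N(mu_t - Sigma_t Omega_{t,t+1} alpha_{t+1}, Sigma_t).
    alpha = np.empty((n, m))
    alpha[n - 1] = rng.multivariate_normal(mu[n - 1], Sigma[n - 1])
    for t in range(n - 2, -1, -1):
        mean_t = mu[t] - Sigma[t] @ (Omega_off[t] @ alpha[t + 1])
        alpha[t] = rng.multivariate_normal(mean_t, Sigma[t])
    return alpha
```

Suppressing the noise in the backward pass recovers the conditional mean Ω⁻¹c exactly, which is one way to see that the recursion is consistent with the joint distribution.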

Efficiency analysis

We compare the computational efficiency of various methods for drawing α|y. We do this using counts of operations and computational experiments with artificial data.

An empirical application to count models

Durbin and Koopman (1997) show how to compute an arbitrarily accurate evaluation of the likelihood function for a semi-Gaussian state–space model in which the state evolves according to Eq. (2), but the conditional distribution of observations given states is given by a general distribution with density (or mass) function p(y|α). To simplify, we suppress the notation for dependence on θ, the vector of parameters.

The approach is as follows. The likelihood function L(θ) we wish to evaluate is L(θ)

Conclusions

In this paper we introduce a new method for drawing state variables in Gaussian state–space models from their conditional distribution given parameters and observations. The method is quite different from standard methods, such as those of de Jong and Shephard (1995) and Durbin and Koopman (2002), that use Kalman filtering. It is much more in the spirit of Rue (2001), who describes an efficient method for drawing Gaussian random vectors with band diagonal precision matrices. As Rue (2001)

Acknowledgement

Miller would like to thank the Fonds québécois de la recherche sur la société et la culture (FQRSC) for financial support.

References (23)

  • J. Song et al., Choosing an appropriate number of factors in factor analysis with incomplete data, Computational Statistics and Data Analysis (2008)
  • J.H. Stock et al., Forecasting inflation, Journal of Monetary Economics (1999)
  • Boivin, J., Giannoni, M., 2006. DSGE models in a data-rich environment. Working Paper 12772, National Bureau of...
  • C.K. Carter et al., On Gibbs sampling for state space models, Biometrika (1994)
  • C.K. Carter et al., Markov chain Monte Carlo in conditionally Gaussian state space models, Biometrika (1996)
  • Chan, J.C.C., Jeliazkov, I., 2009. Efficient Simulation and Integrated Likelihood Estimation in State Space Models....
  • P. de Jong et al., The simulation smoother for time series models, Biometrika (1995)
  • J. Durbin et al., Monte Carlo maximum likelihood estimation for non-Gaussian state space models, Biometrika (1997)
  • J. Durbin et al., A simple and efficient simulation smoother for state space time series analysis, Biometrika (2002)
  • M. Forni et al., The generalized dynamic factor model: identification and estimation, Review of Economics and Statistics (2000)