Simulation smoothing for state–space models: A computational efficiency analysis

https://doi.org/10.1016/j.csda.2010.07.009

Abstract

Simulation smoothing involves drawing state variables (or innovations) in discrete time state–space models from their conditional distribution given parameters and observations. Gaussian simulation smoothing is of particular interest, not only for the direct analysis of Gaussian linear models, but also for the indirect analysis of more general models. Several methods for Gaussian simulation smoothing exist, most of which are based on the Kalman filter. Since states in Gaussian linear state–space models are Gaussian Markov random fields, it is also possible to apply the Cholesky Factor Algorithm (CFA) to draw states. This algorithm takes advantage of the band diagonal structure of the Hessian matrix of the log density to make efficient draws. We show how to exploit the special structure of state–space models to draw latent states even more efficiently. We analyse the computational efficiency of Kalman-filter-based methods, the CFA, and our new method using counts of operations and computational experiments. We show that for many important cases, our method is most efficient. Gains are particularly large for cases where the dimension of observed variables is large or where one makes repeated draws of states for the same parameter values. We apply our method to a multivariate Poisson model with time-varying intensities, which we use to analyse financial market transaction count data.

Introduction

State–space models are time series models featuring both latent and observed variables. The latent variables have different interpretations according to the application. They may be the unobserved states of a system in biology, economics or engineering. They may be time-varying parameters of a model. They may be factors in dynamic factor models, capturing covariances among a large set of observed variables in a parsimonious way.

Gaussian linear state–space models are interesting in their own right, but they are also useful devices for the analysis of more general state–space models. In some cases, the model becomes a Gaussian linear state–space model, or a close approximation, once we condition on certain variables. These variables may be a natural part of the model, as in Carter and Kohn (1996), or they may be convenient but artificial devices, as in Kim et al. (1998), Stroud et al. (2003) and Frühwirth-Schnatter and Wagner (2006).

In other cases, one can approximate the conditional distribution of states in a non-Gaussian or non-linear model by its counterpart in a Gaussian linear model. If the approximation is close enough, one can use the latter for importance sampling, as Durbin and Koopman (1997) do to compute likelihood functions, or as a proposal distribution in a Metropolis–Hastings update, as Shephard and Pitt (1997) do for posterior Markov chain Monte Carlo simulation.

To fix notation, consider the following Gaussian linear state–space model, expressed using notation from de Jong and Shephard (1995):

yt = Xtβ + Ztαt + Gtut,  t = 1, …, n,  (1)

αt+1 = Wtβ + Ttαt + Htut,  t = 1, …, n − 1,  (2)

α1 ~ N(a1, P1),  ut ~ i.i.d. N(0, Iq),

where yt is a p × 1 vector of dependent variables, αt is an m × 1 vector of state variables, and β is a k × 1 vector of coefficients. The matrices Xt, Zt, Gt, Wt, Tt and Ht are known. Eq. (1) is the measurement equation and Eq. (2) is the state equation. Let y ≡ (y1, …, yn) and α ≡ (α1, …, αn).
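To make the notation concrete, the model in Eqs. (1)–(2) can be simulated directly. The sketch below uses small, time-invariant system matrices chosen purely for illustration (the dimensions and values are assumptions, not taken from the paper); note that the same innovation vector ut drives both the measurement and state equations, with Gt and Ht controlling how it enters each.

```python
import numpy as np

# Illustrative dimensions: p = 2 observables, m = 1 state, k = 1 coefficient,
# q = 3 innovations. All system matrices are time-invariant for simplicity.
rng = np.random.default_rng(0)
n, p, m, k, q = 50, 2, 1, 1, 3

X = np.ones((p, k))
Z = np.array([[1.0], [0.5]])
G = np.hstack([0.3 * np.eye(p), np.zeros((p, 1))])   # p x q
W = np.zeros((m, k))
T = np.array([[0.9]])
H = np.hstack([np.zeros((m, p)), np.eye(m)])         # m x q; G @ H.T = 0 here
beta = np.array([0.1])
a1, P1 = np.zeros(m), np.eye(m)

alpha = np.zeros((n, m))
y = np.zeros((n, p))
alpha[0] = rng.multivariate_normal(a1, P1)           # alpha_1 ~ N(a1, P1)
for t in range(n):
    u = rng.standard_normal(q)                       # u_t ~ N(0, I_q)
    y[t] = X @ beta + Z @ alpha[t] + G @ u           # measurement equation (1)
    if t < n - 1:
        alpha[t + 1] = W @ beta + T @ alpha[t] + H @ u  # state equation (2)
```

With this choice of G and H the measurement and state disturbances are uncorrelated; correlated disturbances would correspond to G @ H.T being non-zero.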

We will consider the familiar and important question of simulation smoothing, which is drawing α as a block from its conditional distribution given y. This is an important component of various sampling methods for learning about the posterior distribution of states, parameters and other functions of interest.

Several authors have proposed ways of drawing states in Gaussian linear state–space models using the Kalman filter, including Carter and Kohn (1994), Frühwirth-Schnatter (1994), de Jong and Shephard (1995), and Durbin and Koopman (2002).

Rue (2001) introduces the Cholesky Factor Algorithm (CFA), an efficient way to draw Gaussian Markov Random Fields (GMRFs) based on the Cholesky decomposition of the precision (inverse of variance) of the random field. He also recognizes that the conditional distribution of α given y in Gaussian linear state–space models is a special case of a GMRF. Knorr-Held and Rue (2002) comment on the relationship between the CFA and methods based on the Kalman filter.

Chan and Jeliazkov (2009) describe two empirical applications of the CFA for Bayesian inference in state–space macroeconomic models. One is a time-varying parameter vector autoregression model for output growth, unemployment, income and inflation. The other is a dynamic factor model for US post-war macroeconomic data.

The Kalman filter is used not only for simulation smoothing, but also to evaluate the likelihood function for Gaussian linear state–space models. We can do the same using the CFA and our method. Both give evaluations of f(α|y) for arbitrary α with little additional computation. We can then evaluate the likelihood as f(y) = f(α) f(y|α) / f(α|y) for any value of α. A convenient choice is the conditional mean of α given y, since it is easy to obtain and simplifies the computation of f(α|y).
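The identity f(y) = f(α) f(y|α) / f(α|y) holds for any α, so it can be checked numerically on a toy problem. The sketch below (an assumption-laden one-dimensional example, not the paper's model) uses a scalar state α ~ N(0, 1) and observation y | α ~ N(α, 0.5), for which all four densities are available in closed form.

```python
import numpy as np
from scipy.stats import norm

# Toy check of f(y) = f(alpha) * f(y|alpha) / f(alpha|y)
# for alpha ~ N(0, 1), y | alpha ~ N(alpha, 0.5).
alpha0, y0 = 0.7, -0.2

log_f_alpha = norm.logpdf(alpha0, 0.0, 1.0)
log_f_y_given_alpha = norm.logpdf(y0, alpha0, np.sqrt(0.5))
# Posterior: alpha | y ~ N(y / 1.5, 1/3)  (precision 1 + 1/0.5 = 3)
log_f_alpha_given_y = norm.logpdf(alpha0, y0 / 1.5, np.sqrt(1.0 / 3.0))

# The identity, in logs; the result must not depend on the chosen alpha0.
log_f_y = log_f_alpha + log_f_y_given_alpha - log_f_alpha_given_y
```

The marginal here is y ~ N(0, 1.5), and the computed log_f_y matches its log density at y0 regardless of which α0 is plugged in.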

The Kalman filter also delivers intermediate quantities that are useful for computing filtering distributions, the conditional distributions of α1, …, αt given y1, …, yt, for various values of t. While it is difficult to use the CFA algorithm to compute these distributions efficiently, it is fairly straightforward to do so using our method.

We make four main contributions in this paper. The first is a new method, outlined in Section 2, for drawing states in state–space models. Like the CFA, it uses the precision and co-vector (precision times mean) of the conditional distribution of α given y and does not use the Kalman filter. Unlike the CFA, it generates the conditional means E[αt | αt+1, …, αn, y] and conditional variances Var[αt | αt+1, …, αn, y] as a byproduct. These conditional moments turn out to be useful in an extension of the method, described in McCausland (2008), to non-Gaussian and non-linear state–space models with univariate states, because they facilitate Gaussian and other approximations to the conditional distribution of αt given αt+1 and y. With little additional computation, one can also compute the conditional means E[αt | y1, …, yt] and variances Var[αt | y1, …, yt], which together specify the filtering distributions, useful for sequential learning.

The second main contribution, described in Section 3, is a careful analysis of the computational efficiency of various methods for drawing states, showing that the CFA and our new method are much more computationally efficient than methods based on the Kalman filter when p is large or when repeated draws of α are required. For the important special case of state–space models, our new method is up to twice as fast as the CFA for large m. We find examples of applications with large p in recent work in macroeconomics and forecasting using “data-rich” environments, where a large number of observed time series is linked to a much smaller number of latent factors. See, for example, Boivin and Giannoni (2006), which estimates Dynamic Stochastic General Equilibrium (DSGE) models, or Stock and Watson (1999, 2002) and Forni et al. (2000), which show that factor models with large numbers of variables give better forecasts than small-scale vector autoregressive (VAR) models do. Examples with large numbers of repeated draws of α include the evaluation of the likelihood function in non-linear or non-Gaussian state–space models using importance sampling, as in Durbin and Koopman (1997).

Our third contribution is to illustrate these simulation smoothing methods using an empirical application. In Section 4, we use them to approximate the likelihood function for a multivariate Poisson state–space model, using importance sampling. Latent states govern time-varying intensities. Observed data are transaction counts in financial markets.

The final contribution is the explicit derivation, in Appendix A, of the precision and co-vector of the conditional distribution of α given y in Gaussian linear state–space models. These two objects are the inputs to the CFA and our new method.

We conclude in Section 5.


Precision-based methods for simulation smoothing

In this section we discuss two methods for state smoothing using the precision Ω and co-vector c of the conditional distribution of α given y. The first method is due to Rue (2001), who considers the more general problem of drawing Gaussian Markov random fields. The second method, introduced here, offers new insights and more efficient draws for the special case of Gaussian linear state–space models. Both methods involve pre-computation, which one performs once for a given Ω and c, and
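Both methods exploit the fact that the conditional precision Ω of α given y is block tridiagonal. The sketch below illustrates the general idea of a precision-based sampler via a block forward elimination followed by a backward sampling pass; it is written under that block-tridiagonal assumption and is not a reproduction of the paper's exact algorithm, whose details may differ.

```python
import numpy as np

def draw_states(Omega_diag, Omega_off, c, rng):
    """Draw alpha ~ N(Omega^{-1} c, Omega^{-1}) for a block-tridiagonal
    precision Omega.  Omega_diag[t] is the m x m diagonal block Omega_{t,t},
    Omega_off[t] is the off-diagonal block Omega_{t,t+1}, and c[t] is the
    t-th block of the co-vector.  Illustrative sketch only."""
    n, m = c.shape
    Sigma = np.empty((n, m, m))   # Var[alpha_t | alpha_{t+1}, ..., alpha_n, y]
    mu = np.empty((n, m))         # intermediate means from the forward pass
    # Forward pass: eliminate alpha_1, ..., alpha_{n-1} in turn
    # (block LDL'-style Schur complements of the tridiagonal precision).
    Sigma[0] = np.linalg.inv(Omega_diag[0])
    mu[0] = Sigma[0] @ c[0]
    for t in range(1, n):
        B = Omega_off[t - 1]      # Omega_{t-1,t}
        Sigma[t] = np.linalg.inv(Omega_diag[t] - B.T @ Sigma[t - 1] @ B)
        mu[t] = Sigma[t] @ (c[t] - B.T @ mu[t - 1])
    # Backward pass: alpha_n ~ N(mu_n, Sigma_n), then
    # alpha_t | alpha_{t+1} ~ N(mu_t - Sigma_t Omega_{t,t+1} alpha_{t+1}, Sigma_t).
    alpha = np.empty((n, m))
    alpha[n - 1] = rng.multivariate_normal(mu[n - 1], Sigma[n - 1])
    for t in range(n - 2, -1, -1):
        mean_t = mu[t] - Sigma[t] @ (Omega_off[t] @ alpha[t + 1])
        alpha[t] = rng.multivariate_normal(mean_t, Sigma[t])
    return alpha
```

Suppressing the noise in the backward pass recovers the conditional mean Ω⁻¹c exactly, which is one way to see that the recursion is consistent with the joint distribution.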

Efficiency analysis

We compare the computational efficiency of various methods for drawing α|y. We do this using counts of operations and computational experiments with artificial data.

An empirical application to count models

Durbin and Koopman (1997) show how to compute an arbitrarily accurate evaluation of the likelihood function for a semi-Gaussian state–space model in which the state evolves according to Eq. (2), but the conditional distribution of observations given states is given by a general distribution with density (or mass) function p(y|α). To simplify, we suppress the notation for dependence on θ, the vector of parameters.

The approach is as follows. The likelihood function L(θ) we wish to evaluate is L(θ)

Conclusions

In this paper we introduce a new method for drawing state variables in Gaussian state–space models from their conditional distribution given parameters and observations. The method is quite different from standard methods, such as those of de Jong and Shephard (1995) and Durbin and Koopman (2002), that use Kalman filtering. It is much more in the spirit of Rue (2001), who describes an efficient method for drawing Gaussian random vectors with band diagonal precision matrices. As Rue (2001)

Acknowledgement

Miller would like to thank the Fonds québécois de la recherche sur la société et la culture (FQRSC) for financial support.

References (23)

  • J. Song et al., Choosing an appropriate number of factors in factor analysis with incomplete data, Computational Statistics and Data Analysis (2008)
  • J.H. Stock et al., Forecasting inflation, Journal of Monetary Economics (1999)
  • Boivin, J., Giannoni, M., 2006. DSGE models in a data-rich environment. Working Paper 12772, National Bureau of...
  • C.K. Carter et al., On Gibbs sampling for state space models, Biometrika (1994)
  • C.K. Carter et al., Markov chain Monte Carlo in conditionally Gaussian state space models, Biometrika (1996)
  • Chan, J.C.C., Jeliazkov, I., 2009. Efficient Simulation and Integrated Likelihood Estimation in State Space Models....
  • P. de Jong et al., The simulation smoother for time series models, Biometrika (1995)
  • J. Durbin et al., Monte Carlo maximum likelihood estimation for non-Gaussian state space models, Biometrika (1997)
  • J. Durbin et al., A simple and efficient simulation smoother for state space time series analysis, Biometrika (2002)
  • M. Forni et al., The generalized dynamic factor model: identification and estimation, Review of Economics and Statistics (2000)