
Efficient MCMC sampling in dynamic mixture models

  • Published in: Statistics and Computing

Abstract

We show how to improve the efficiency of Markov chain Monte Carlo (MCMC) simulations in dynamic mixture models by block-sampling the discrete latent variables. Two algorithms are proposed: the first is a multi-move extension of the single-move Gibbs sampler devised by Gerlach, Carter and Kohn (in J. Am. Stat. Assoc. 95, 819–828, 2000); the second is an adaptive Metropolis-Hastings scheme that performs well even when the number of discrete states is large. Three empirical examples illustrate the gain in efficiency achieved. We also show that visual inspection of the sample partial autocorrelations of the discrete latent variables helps to anticipate whether blocking is likely to be effective.


References

  • Andrieu, C., Moulines, E.: On the ergodicity properties of some adaptive MCMC algorithms. Ann. Appl. Probab. 16(3), 1462–1505 (2006)

  • Atchadé, Y., Rosenthal, J.: On adaptive Markov chain Monte Carlo algorithms. Bernoulli 11, 815–828 (2007)

  • Bauwens, L., Lubrano, M., Richard, J.: Bayesian Inference in Dynamic Econometric Models. Oxford University Press, Oxford (1999)

  • Besag, J., Green, E., Higdon, D., Mengersen, K.: Bayesian computation and stochastic systems (with discussion). Stat. Sci. 10, 3–66 (1995)

  • Carter, C., Kohn, R.: On Gibbs sampling for state space models. Biometrika 81, 541–553 (1994)

  • Carter, C., Kohn, R.: Semiparametric Bayesian inference for time series with mixed spectra. J. R. Stat. Soc. B 59(1), 255–268 (1997)

  • Casella, G., George, E.I.: Explaining the Gibbs sampler. Am. Stat. 46(3), 167–174 (1992)

  • Chib, S.: Calculating posterior distributions and modal estimates in Markov mixture models. J. Econom. 75, 79–97 (1996)

  • Engle, C., Kim, C.-J.: The long-run US/UK real exchange rate. J. Money Credit Bank. 31(3), 335–356 (1999)

  • Fama, E.F., French, K.R.: Common risk factors in the returns on stocks and bonds. J. Financ. Econ. 33(1), 3–56 (1993)

  • Fiorentini, G., Sentana, E., Shephard, N.: Likelihood-based estimation of latent generalized ARCH structures. Econometrica 72(5), 1481–1517 (2004)

  • Fruhwirth-Schnatter, S.: Data augmentation and dynamic linear models. J. Time Ser. Anal. 15(2), 183–202 (1994)

  • Fruhwirth-Schnatter, S.: Finite Mixture and Markov Switching Models. Springer, New York (2006)

  • Gerlach, R., Carter, C., Kohn, R.: Efficient Bayesian inference for dynamic mixture models. J. Am. Stat. Assoc. 95, 819–828 (2000)

  • Giordani, P., Kohn, R.: Efficient Bayesian inference for multiple change-point and mixture innovation models. J. Bus. Econ. Stat. 26(1), 66–77 (2008)

  • Giordani, P., Kohn, R.: Adaptive independent Metropolis-Hastings by fast estimation of mixtures of normals. J. Comput. Graph. Stat. 19(2), 243–259 (2010)

  • Giordani, P., Kohn, R., van Dijk, D.: A unified approach to nonlinearity, structural change, and outliers. J. Econom. 137, 112–133 (2007)

  • Haario, H., Saksman, E., Tamminen, G.: An adaptive Metropolis algorithm. Bernoulli 7, 223–242 (2001)

  • Hamilton, J.D.: A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57(2), 357–384 (1989)

  • Kim, C.-J.: Dynamic linear models with Markov-switching. J. Econom. 60, 1–22 (1994)

  • Kim, C.-J., Nelson, C.R.: State-Space Models with Regime Switching: Classical and Gibbs Sampling Approaches with Applications. MIT Press, Cambridge (1999)

  • Kim, C.-J., Shephard, N., Chib, S.: Stochastic volatility: likelihood inference and comparison with ARCH models. Rev. Econ. Stud. 65(3), 361–393 (1998)

  • Koopman, S.: Exact initial Kalman filtering and smoothing for nonstationary time series models. J. Am. Stat. Assoc. 92(440), 1630–1638 (1997)

  • Liu, J.S., Wong, W.H., Kong, A.: Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika 81(1), 27–40 (1994)

  • Nott, D.J., Kohn, R.: Adaptive sampling for Bayesian model selection. Biometrika 92(4), 747–763 (2005)

  • Omori, Y., Chib, S., Shephard, N., Nakajima, J.: Stochastic volatility with leverage: fast and efficient likelihood inference. J. Econom. 140, 425–449 (2007)

  • Roberts, G.O., Rosenthal, J.: Coupling and ergodicity of adaptive MCMC. J. Appl. Probab. 44, 458–475 (2007)

  • Roberts, G.O., Rosenthal, J.: Examples of adaptive MCMC. J. Comput. Graph. Stat. 18(2), 349–367 (2009)

  • Roberts, G.O., Sahu, S.K.: Updating schemes, correlation structure, blocking and parameterization for the Gibbs sampler. J. R. Stat. Soc. B 59(2), 291–337 (1997)

  • Scott, S.L.: Bayesian methods for hidden Markov models: recursive computing in the 21st century. J. Am. Stat. Assoc. 97(457), 337–351 (2002)

  • Seewald, W.: Discussion on Parameterization issues in Bayesian inference (by S.E. Hills and F.M. Smith). In: Bernardo, J.M., Berger, J.O., Dawid, A.P., Smith, A.F.M. (eds.) Bayesian Statistics, vol. 4, pp. 241–243. Oxford University Press, Oxford (1992)

  • Shephard, N., Pitt, M.: Likelihood analysis of non-Gaussian measurement time series. Biometrika 83(4), 653–667 (1997)

  • Tierney, L.: Markov chains for exploring posterior distributions. Ann. Stat. 22(4), 1701–1762 (1994)

  • Timmermann, A.: Moments of Markov switching models. J. Econom. 96, 75–111 (2000)

  • Yang, M.: Some properties of vector autoregressive processes with Markov-switching coefficients. Econom. Theory 16, 23–43 (2000)


Author information

Corresponding author

Correspondence to Alessandro Rossi.

Additional information

The authors are grateful to the participants of the Bayesian Econometrics workshop of the Rimini Centre for Economic Analysis, and to two anonymous referees for helpful comments. The ideas expressed here are those of the authors and do not necessarily reflect the position of the European Commission.

Appendix: Convergence of the adaptive MH algorithm

As above, we denote by \(\mathbf{S}_{t,t+h-1}=(S_{t},\ldots,S_{t+h-1})\) the random vector defined on the finite space ℵ that contains all possible paths of length h. Our aim is to show that the distribution of \(\{\mathbf{S}_{t,t+h-1}^{(n)}; n \ge 1 \}\) generated by the adaptive MH algorithm given in Sect. 2.4, with the independence-type transition kernel

$$ T_n(\mathbf{s}_{t,t+h-1},\overline{\mathbf{s}}_{t,t+h-1}) = \alpha_n(\mathbf{s}_{t,t+h-1},\overline{\mathbf{s}}_{t,t+h-1})\,\tilde{q}_n(\overline{\mathbf{s}}_{t,t+h-1}) + \mathsf{1}(\mathbf{s}_{t,t+h-1}=\overline{\mathbf{s}}_{t,t+h-1})\biggl[1-\sum_{\mathbf{u}}\alpha_n(\mathbf{s}_{t,t+h-1},\mathbf{u})\,\tilde{q}_n(\mathbf{u})\biggr], \qquad \alpha_n(\mathbf{s},\overline{\mathbf{s}}) = \min\biggl\{1,\frac{\pi(\overline{\mathbf{s}})\,\tilde{q}_n(\mathbf{s})}{\pi(\mathbf{s})\,\tilde{q}_n(\overline{\mathbf{s}})}\biggr\}, $$

converges to the correct target distribution, say \(\pi_{w}(\mathbf{S}_{t,t+h-1})=\pi(\mathbf{S}_{t,t+h-1})\), where the subscript w indicates the conditioning on \(\mathbf{w} \equiv (\mathbf{s}_{1,t-1},\mathbf{s}_{t+h}^{T},\boldsymbol{\theta},\mathbf{y})\), which is omitted in what follows. We focus on a generic block, since convergence in distribution of the full sequence \(\{\mathbf{S}^{(n)}; n\ge 1\}\) to \(\pi(\mathbf{S})\) is ensured by the results in Tierney (1994).
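The block update is an independence-type Metropolis-Hastings step over a finite set of paths. As a minimal numerical sketch (not the authors' code; the path space, target vector `pi`, and proposal vector `q_tilde` below are toy values for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def independence_mh_step(current, pi, q_tilde, rng):
    """One independence MH step on a finite space of enumerated paths.

    `pi` and `q_tilde` hold target and proposal probabilities over the
    paths; `current` is the index of the current path.
    """
    proposal = rng.choice(len(q_tilde), p=q_tilde)
    # Acceptance probability: min{1, pi(s')q(s) / (pi(s)q(s'))}
    log_ratio = (np.log(pi[proposal]) + np.log(q_tilde[current])
                 - np.log(pi[current]) - np.log(q_tilde[proposal]))
    if np.log(rng.uniform()) < log_ratio:
        return proposal
    return current

pi = np.array([0.5, 0.3, 0.2])        # toy target over 3 enumerated paths
q_tilde = np.array([1/3, 1/3, 1/3])   # proposal (fixed and uniform here)
state = 0
counts = np.zeros(3)
for _ in range(20000):
    state = independence_mh_step(state, pi, q_tilde, rng)
    counts[state] += 1
print(counts / counts.sum())          # empirical frequencies approach pi
```

In the paper's scheme the proposal itself changes with n; the convergence argument below is about why that adaptation does not break ergodicity.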

The adaptive proposal \(\tilde{q}_{n}(\mathbf{s}_{t,t+h-1})\) is such that:

$$ \tilde{q}_n(\mathbf{s}_{t,t+h-1}) = \delta q_0( \mathbf{s}_{t,t+h-1}) + (1-\delta)q_n(\mathbf{s}_{t,t+h-1}) $$

where 0<δ<1, \(q_0(\cdot)\) is the uniform distribution on ℵ, and the adaptive component \(q_n(\mathbf{s}_{t,t+h-1})\) is detailed in (2.6), (2.8), and (2.10) with the recursions (2.14)–(2.15) for the marginal and transition probabilities. The proof consists of verifying that this adaptive scheme satisfies the sufficient conditions given in Giordani and Kohn (2010):

$$ \mathrm{GK1:}\quad \frac{\pi(\mathbf{s}_{t,t+h-1})}{\tilde{q}_n(\mathbf{s}_{t,t+h-1})} \le K \quad\text{and}\quad \frac{q_n(\mathbf{s}_{t,t+h-1})}{\tilde{q}_n(\mathbf{s}_{t,t+h-1})} \le K, $$

$$ \mathrm{GK2:}\quad \bigl|q_n(\mathbf{s}_{t,t+h-1}) - q_{n+1}(\mathbf{s}_{t,t+h-1})\bigr| = O\bigl(n^{-r}\bigr), $$

for some constants K>0 and r>0.
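The boundedness in Condition GK1 comes entirely from the uniform safeguard component of the mixture proposal. A numerical illustration (the values of N, h and δ are arbitrary, and the adaptive component and target are random stand-ins):

```python
import numpy as np

rng = np.random.default_rng(1)
M = 4 ** 3        # illustrative: N = 4 states, block length h = 3, M = N**h paths
delta = 0.1

q0 = np.full(M, 1.0 / M)             # uniform safeguard component
qn = rng.dirichlet(np.ones(M))       # stand-in for the adaptive component
q_tilde = delta * q0 + (1 - delta) * qn

pi = rng.dirichlet(np.ones(M))       # stand-in for the target on the path space

# q_tilde >= delta / M everywhere, while pi and qn are probabilities (<= 1),
# so both GK1 ratios are bounded by M / delta = N**h / delta.
bound = M / delta
print((pi / q_tilde).max() <= bound, (qn / q_tilde).max() <= bound)
```

The bound \(N^{h}/\delta\) holds whatever the adaptive component does, which is exactly why the δ-mixture with the uniform distribution is included.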

Condition GK1 is immediate since the distributions \(\pi(\mathbf{s}_{t,t+h-1})\) and \(q_n(\mathbf{s}_{t,t+h-1})\) are defined on the discrete space ℵ, so they are bounded above by 1. Furthermore, the constant term \(q_0(\mathbf{s}_{t,t+h-1})\) equals \(1/N^{h}\), so \(\tilde{q}_n(\mathbf{s}_{t,t+h-1}) \geq \delta/N^{h}\) and the two ratios in GK1 are bounded above by \(N^{h}/\delta\).

Condition GK2 needs more attention. The weight that the adaptive component attaches to a given block is a function of the marginal and transition probabilities and can be written as:

$$ q_n(\mathbf{s}_{t,t+h-1}) = \frac{q_n(s_t)\prod_{k=1}^{h-1} q_n(s_{t+k}\mid s_{t+k-1})}{\sum_{s_t}\cdots\sum_{s_{t+h-1}} q_n(s_t)\prod_{k=1}^{h-1} q_n(s_{t+k}\mid s_{t+k-1})} \quad (\mathrm{A.1}) $$

We first show that the marginal and transition probabilities \(q_n(s_t)\) and \(q_n(s_{t+1}\mid s_t)\) involved in Eq. (A.1) settle down asymptotically. The recursion for the marginal probabilities (2.14) yields:

$$ q_n(s_t) - q_{n+1}(s_t) = \frac{1}{n} \bigl[q_n(s_t) - \mathsf{1} \bigl(s_t^{(n-1)}=s_t\bigr)\bigr] = O \bigl(n^{-1}\bigr) $$

whereas the one for the transition probabilities (2.15) implies:

$$ q_n(s_{t+1}\mid s_t) - q_{n+1}(s_{t+1}\mid s_t) = O\biggl(\frac{1}{n\,q_{n+1}(s_t)}\biggr). $$

Giordani and Kohn (2010, Lemma 1 in the Appendix) show that Condition GK1 implies uniform ergodicity, i.e. \(T_{n}(\mathbf{s}_{t,t+h-1},\overline{\mathbf{s}}_{t,t+h-1}) \geq \epsilon \pi(\overline{\mathbf{s}}_{t,t+h-1})\) for some ϵ>0. This ensures that \(q_n(s_t)\) is strictly positive asymptotically, and thus \(q_n(s_{t+1}\mid s_t)-q_{n+1}(s_{t+1}\mid s_t)=O(n^{-1})\).
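The \(O(n^{-1})\) decay of the marginal updates can be checked numerically. A short sketch, assuming (2.14) — which is not reproduced in this excerpt — is the running-frequency update \(q_{n+1}(s_t)=q_n(s_t)+\frac{1}{n+1}[\mathsf{1}(s_t^{(n)}=s_t)-q_n(s_t)]\) that the displayed difference suggests:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 3                          # number of discrete states (illustrative)
q = np.full(N, 1.0 / N)        # initial marginal probabilities

diffs = []
for n in range(1, 5001):
    s = rng.integers(N)                       # latent-state draw at iteration n
    indicator = np.eye(N)[s]                  # 1(s_t^{(n)} = s_t) as a vector
    q_next = q + (indicator - q) / (n + 1)    # running-frequency update (2.14)
    diffs.append(np.abs(q - q_next).max())
    q = q_next

# Each update moves q by at most 1/(n+1): n * |q_n - q_{n+1}| stays bounded.
print(max(n * d for n, d in enumerate(diffs, start=1)))
```

The bound is deterministic: the indicator and the current probabilities both lie in [0, 1], so the step size is at most \(1/(n+1)\) regardless of the realised draws.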

Let us write \(q_n(s_t)-q_{n+1}(s_t)=\epsilon_0\) and \(q_n(s_{t+k}\mid s_{t+k-1})-q_{n+1}(s_{t+k}\mid s_{t+k-1})=\epsilon_k\) with \(\epsilon_k=O(n^{-1})\), and let \(C_n\) denote the numerator of \(q_n(\mathbf{s}_{t,t+h-1})\) in Eq. (A.1). We have:

$$ C_{n+1} = \bigl[q_n(s_t)-\epsilon_0\bigr]\prod_{k=1}^{h-1}\bigl[q_n(s_{t+k}\mid s_{t+k-1})-\epsilon_k\bigr] = C_n + O\bigl(n^{-1}\bigr), $$

so \(C_n - C_{n+1} = \epsilon_C = O(n^{-1})\). Let \(D_n\) denote the denominator of \(q_n(\mathbf{s}_{t,t+h-1})\) in (A.1). Since

$$D_n= \sum_{s_t}\cdots\sum_{s_{t+h-1}} C_n,\qquad D_n - D_{n+1}=\epsilon_D = O\bigl(n^{-1}\bigr) $$

as well. Hence:

$$ q_n(\mathbf{s}_{t,t+h-1}) - q_{n+1}(\mathbf{s}_{t,t+h-1}) = \frac{C_n}{D_n} - \frac{C_n-\epsilon_C}{D_n-\epsilon_D} = \frac{\epsilon_C D_n - \epsilon_D C_n}{D_n(D_n-\epsilon_D)} = O\bigl(n^{-1}\bigr), $$

which proves Condition GK2.
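The conclusion — that O(1/n) perturbations of the marginal and transition probabilities induce O(1/n) perturbations of the block proposal — can also be seen numerically. A sketch, assuming (A.1) is the normalised product of marginal and transition probabilities (N, h and the probability values below are illustrative):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(3)
N, h = 3, 4                    # number of states and block length (illustrative)

def block_dist(marg, trans):
    """Eq. (A.1): distribution over all N**h paths from marginals/transitions."""
    probs = []
    for path in product(range(N), repeat=h):
        p = marg[path[0]]                    # numerator C: q(s_t) ...
        for a, b in zip(path, path[1:]):
            p *= trans[a, b]                 # ... times prod q(s_{t+k}|s_{t+k-1})
        probs.append(p)
    probs = np.array(probs)
    return probs / probs.sum()               # divide by denominator D

marg = rng.dirichlet(np.ones(N))
trans = rng.dirichlet(np.ones(N), size=N)

gaps = {}
for n in (10, 100, 1000):
    # Perturb marginals and transitions by O(1/n), mimicking (2.14)-(2.15).
    marg_n = marg + (rng.dirichlet(np.ones(N)) - marg) / n
    trans_n = trans + (rng.dirichlet(np.ones(N), size=N) - trans) / n
    gaps[n] = np.abs(block_dist(marg, trans) - block_dist(marg_n, trans_n)).max()
print(gaps)   # the gap shrinks roughly like 1/n
```

This is the finite-space analogue of the \(C_n/D_n\) argument: the ratio of two quantities that each move by O(1/n), with the denominator bounded away from zero, itself moves by O(1/n).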


About this article

Cite this article

Fiorentini, G., Planas, C. & Rossi, A. Efficient MCMC sampling in dynamic mixture models. Stat Comput 24, 77–89 (2014). https://doi.org/10.1007/s11222-012-9354-4
