
Efficient Bayesian estimation of the multivariate Double Chain Markov Model

Statistics and Computing

Abstract

The Double Chain Markov Model (DCMM) is used to model an observable process \(Y = \{Y_{t}\}_{t=1}^{T}\) as a Markov chain whose transition matrix \(P_{x_{t}}\) depends on the value of an unobservable (hidden) Markov chain \(\{X_{t}\}_{t=1}^{T}\). We present and justify an efficient algorithm for sampling from the posterior distribution associated with the DCMM when the observable process \(Y\) consists of independent vectors of (possibly) different lengths. Convergence of the Gibbs sampler used to simulate the posterior density is improved by adding a random permutation step. Simulation studies illustrate the method. The problem that motivated our model, presented at the end, is an application to real data: the credit rating dynamics of a portfolio of financial companies, where the (unobserved) hidden process is the state of the broader economy.
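To make the model structure concrete, here is a minimal simulation sketch of a DCMM in Python. The number of states, the hidden-chain transition matrix Q, the per-regime observed-chain transition matrices collected in P, and the initial distributions pi_x and pi_y are all illustrative assumptions, not values from the paper; only the role of \(P_{x_{t}}\) mirrors the notation above.

```python
# Minimal DCMM simulation sketch. All numbers below are illustrative
# assumptions, not quantities taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

a, b, T = 2, 3, 50          # hidden states, observed states, chain length
Q = np.array([[0.9, 0.1],   # hidden-chain transition matrix
              [0.2, 0.8]])
P = np.array([              # one observed-chain transition matrix per hidden state
    [[0.7, 0.2, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]],
    [[0.3, 0.4, 0.3],
     [0.3, 0.3, 0.4],
     [0.4, 0.3, 0.3]],
])
pi_x = np.array([0.5, 0.5])     # initial distribution of the hidden chain
pi_y = np.full(b, 1.0 / b)      # initial distribution of the observed chain

x = np.empty(T, dtype=int)
y = np.empty(T, dtype=int)
x[0] = rng.choice(a, p=pi_x)
y[0] = rng.choice(b, p=pi_y)
for t in range(1, T):
    x[t] = rng.choice(a, p=Q[x[t - 1]])        # hidden chain evolves on its own
    y[t] = rng.choice(b, p=P[x[t], y[t - 1]])  # observed chain uses P_{x_t}
```

In the multivariate setting of the paper, the observation at each time point is a vector (for instance, one rating per company), with all components driven by the same hidden state; the sketch above shows only a single univariate component.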



Acknowledgements

The author would like to thank an associate editor and two referees for their valuable comments.

Author information


Corresponding author

Correspondence to Dobrin Marchev.

Appendix: Proof of Theorem 1

Proof

The joint mass function of the hidden states, given the parameters and the observed data (as vectors at each time point), is

The "typical term" can be written as

by Lemma 1. Now, \(\frac{\mbox{P}(\boldsymbol{x}^{t+2}, \boldsymbol{y}^{t+1} \vert \boldsymbol{y}_{,t}, x_{t}, x_{t+1}, \theta)}{\mbox{P}(\boldsymbol{x}^{t+1}, \boldsymbol{y}^{t+1} \vert \boldsymbol{y}_{,t}, \theta)}\) depends only on \(x_{t+1}\); since it does not involve \(x_{t}\), it can be absorbed into the normalizing constant. That is,

(10)

We continue, in more detail, to show

(11)

By the law of total probability and Lemma 1, we have

and consequently, from (10) and (11),

This recursion is initialized at \(t = u_{0}\) by setting \(\mbox{P}(x_{u_{0}} \vert \boldsymbol{y}_{,M}, \theta) = \mbox{P}(x_{u_{0}} \vert r)\), which corresponds to the Dirichlet prior \(D(\alpha_{01}, \ldots, \alpha_{0a})\). □
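The backward decomposition above is what allows the hidden path to be updated in a single block within the Gibbs sampler. As a rough, single-sequence illustration of that idea (not the paper's exact multivariate recursion, which handles several observation vectors of different lengths and the initialization at \(u_{0}\)), the sketch below runs a standard forward-filtering, backward-sampling pass for the simple DCMM set up after the abstract; Q, P, and pi_x are the same illustrative objects used there.

```python
import numpy as np

def ffbs_hidden_path(y, Q, P, pi_x, rng):
    """Draw one hidden path x given an observed path y and fixed parameters.

    Q    : (a, a) hidden-chain transition matrix
    P    : (a, b, b) observed-chain transition matrices, one per hidden state
    pi_x : (a,) initial distribution of the hidden chain
    """
    T, a = len(y), Q.shape[0]
    # Forward filtering: alpha[t, x] = P(X_t = x | y_0, ..., y_t).
    alpha = np.empty((T, a))
    alpha[0] = pi_x / pi_x.sum()            # y_0's term is constant in x in this sketch
    for t in range(1, T):
        pred = alpha[t - 1] @ Q             # one-step prediction of the hidden state
        alpha[t] = pred * P[:, y[t - 1], y[t]]  # weight by P_{x_t}(y_{t-1} -> y_t)
        alpha[t] /= alpha[t].sum()
    # Backward sampling: draw x_{T-1}, then x_t given x_{t+1} for t = T-2, ..., 0.
    x = np.empty(T, dtype=int)
    x[-1] = rng.choice(a, p=alpha[-1])
    for t in range(T - 2, -1, -1):
        w = alpha[t] * Q[:, x[t + 1]]       # filtered probs times transition into x_{t+1}
        x[t] = rng.choice(a, p=w / w.sum())
    return x
```

Within a full Gibbs sweep, such a draw would typically alternate with updates of the transition probabilities under their Dirichlet priors and with the random permutation step mentioned in the abstract.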


Cite this article

Fitzpatrick, M., Marchev, D. Efficient Bayesian estimation of the multivariate Double Chain Markov Model. Stat Comput 23, 467–480 (2013). https://doi.org/10.1007/s11222-012-9323-y
