
Statistical inference for mixture GARCH models with financial application


Abstract

In this paper we consider mixture generalized autoregressive conditional heteroskedastic (GARCH) models and propose a new EM-type iterative algorithm for estimating the model parameters. The maximum likelihood estimates are shown to be consistent, and their asymptotic properties are investigated. More precisely, we derive simple closed-form expressions for the asymptotic covariance matrix and the expected Fisher information matrix of the ML estimator. Finally, we study model selection and propose testing procedures. A simulation study and an application to real financial series illustrate the results.


References

  • Abramson A, Cohen I (2007) On the stationarity of Markov-switching GARCH processes. Econom Theory 23(3):485–500

  • Alexander C, Lazar E (2006) Normal mixture GARCH\((1, 1)\): applications to exchange rate modelling. J Appl Econom 21(3):307–336

  • Augustyniak M, Boudreault M, Morales M (2018) Maximum likelihood estimation of the Markov-switching GARCH model based on a general collapsing procedure. Methodol Comput Appl Probab 20(1):165–188

  • Bauwens L, Hafner CM, Rombouts JVK (2007) Multivariate mixed normal conditional heteroskedasticity. Comput Stat Data Anal 51:3551–3566

  • Bauwens L, Preminger A, Rombouts JVK (2010) Theory and inference for a Markov switching GARCH model. Econom J 13:218–244

  • Bibi A (2019) QML estimation of asymmetric Markov switching GARCH\((p, q)\) processes. Commun Math Stat. https://doi.org/10.1007/s40304-019-00197-0

  • Bibi A, Ghezal A (2018) Markov switching bilinear GARCH models: structure and estimation. Commun Stat Theory Methods 47(2):307–323

  • Bibi A, Ghezal A (2019) QMLE of periodic time-varying bilinear GARCH models. Commun Stat Theory Methods 48(13):3291–3310

  • Bickel P, Ritov Y, Ryden T (1998) Asymptotic normality of the maximum likelihood estimator for general hidden Markov models. Ann Stat 26:1614–1635

  • Billio M, Cavicchioli M (2017) Markov switching GARCH models: filtering, approximations and duality. In: Corazza M, Legros F, Perna C, Sibillo M (eds) Mathematical and statistical methods for actuarial sciences and finance. Springer, Cham

  • Bollerslev T (1986) Generalized autoregressive conditional heteroskedasticity. J Econom 31:307–327

  • Cavicchioli M (2014a) Determining the number of regimes in Markov switching VAR and VMA models. J Time Ser Anal 35(2):173–186

  • Cavicchioli M (2014b) Autocovariance and linear transformations of Markov switching VARMA processes. Cent Eur J Econ Model Econom 6:275–289

  • Cavicchioli M (2018) On mixture autoregressive conditional heteroskedasticity. J Stat Plan Inference 197:35–50

  • Chen CWS, Lee S, Chen SY (2016) Local non-stationarity test in mean for Markov switching GARCH models: an approximate Bayesian approach. Comput Stat 31:1–24

  • Cheng X, Yu PLH, Li WK (2009) On a dynamic mixture GARCH model. J Forecast 28:247–265

  • Choi BS (1992) ARMA model identification. Springer, New York

  • Chuffart T (2015) Selection criteria in regime switching conditional volatility models. Econometrics 3:289–316

  • Das D, Yoo BH (2004) A Bayesian MCMC algorithm for Markov switching GARCH models. Econometric Society, North American Summer Meetings, No. 179

  • Douc R, Moulines E, Ryden T (2004) Asymptotic properties of the maximum likelihood estimator in autoregressive models with Markov regime. Ann Stat 32:2254–2304

  • Engle RF (1982) Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50:987–1007

  • Francq C, Roussignol M, Zakoïan JM (2001) Conditional heteroskedasticity driven by hidden Markov chains. J Time Ser Anal 22(2):197–220

  • Francq C, Zakoïan JM (2005) The \(L^2\)-structures of standard and switching-regime GARCH models. Stoch Process Appl 115:1557–1582

  • Francq C, Zakoïan JM (2008) Deriving the autocovariances of powers of Markov-switching GARCH models, with applications to statistical inference. Comput Stat Data Anal 52:3027–3046

  • Francq C, Zakoïan JM (2012) Strict stationarity testing and estimation of explosive and stationary generalized autoregressive conditional heteroscedasticity models. Econometrica 80(2):821–861

  • Frühwirth-Schnatter S, Celeux G, Robert CP (eds) (2019) Handbook of mixture analysis. CRC Press, Boca Raton

  • Gable J, Van Norden S, Vigfusson R (1997) Analytical derivatives for Markov switching models. Comput Econ 10:187–194

  • Gerlach R, Tuyl F (2006) MCMC methods for comparing stochastic volatility and GARCH models. Int J Forecast 22:91–107

  • Glodek M, Schels M, Schwenker F (2013) Ensemble Gaussian mixture models for probability density estimation. Comput Stat 28:127–138

  • Haas M (2010) Skew-normal mixture and Markov-switching GARCH processes. Stud Nonlinear Dyn Econom 14(4):1–56

  • Haas M, Mittnik S, Paolella MS (2004a) A new approach to Markov-switching GARCH models. J Financ Econom 2(4):493–530

  • Haas M, Mittnik S, Paolella MS (2004b) Mixed normal conditional heteroskedasticity. J Financ Econom 2(2):211–250

  • Hamadeh T, Zakoïan JM (2011) Asymptotic properties of LS and QML estimators for a class of nonlinear GARCH processes. J Stat Plan Inference 141:488–507

  • Hamilton JD (1994) Time series analysis. Princeton University Press, Princeton

  • Henneke JS, Rachev ST, Fabozzi FJ, Nikolov M (2011) MCMC-based estimation of Markov switching ARMA–GARCH models. Appl Econ 43(3):259–271

  • Krolzig HM (1997) Markov-switching vector autoregressions: modelling, statistical inference and application to business cycle analysis. Springer, Berlin

  • Lee O (2013) The functional central limit theorem for ARMA–GARCH processes. Econom Lett 121:432–435

  • Liu JC (2006) Stationarity of a Markov-switching GARCH model. J Financ Econom 4(4):573–593

  • Lütkepohl H (2007) New introduction to multiple time series analysis, 2nd edn. Springer, Berlin

  • Magnus JR, Neudecker H (1986) Symmetry, 0–1 matrices and Jacobians: a review. Econom Theory 2:157–190

  • McLachlan G, Krishnan T (1997) The EM algorithm and extensions. Wiley, New York

  • McLachlan G, Peel D (2000) Finite mixture models. Wiley, New York

  • Tse YK, Tsui AKC (2002) A multivariate generalized autoregressive conditional heteroscedasticity model with time-varying correlations. J Bus Econ Stat 20(3):351–362

  • Wong CS, Li WK (2000) On a mixture autoregressive model. J R Stat Soc B 62:95–115

  • Wong CS, Li WK (2001) On a mixture autoregressive conditional heteroscedastic model. J Am Stat Assoc 96:982–995

  • Zhang Z, Li WK, Yuen KC (2006) On a mixture GARCH time-series model. J Time Ser Anal 27(4):577–597


Acknowledgements

This work is financially supported by a FAR 2020 research grant of the University of Modena and Reggio Emilia, Italy. We would like to thank the Editor-in-Chief of the journal, Professor Wataru Sakamoto, and two anonymous referees for their constructive comments and very useful suggestions, which were most valuable in improving the final version of the paper.

Author information

Correspondence to Maddalena Cavicchioli.


Appendix

Proof of Proposition 3.1

Let \(f(y_t | \mathbf{z}_t, \mathbf{Y}_{t - 1}; {\varvec{\lambda }})\) denote the conditional density, and \(Pr(\mathbf{z}_t | \mathbf{z}_{t - 1}; {\varvec{\lambda }})\) the conditional probability, for every \(t = 1, \dots , T\).

Set

$$\begin{aligned} f(\mathbf{Y}_T | \mathbf{z}; {\varvec{\lambda }}) \, = \, \prod _{t = 1}^T\, f(y_t | \mathbf{z}_t, \mathbf{Y}_{t - 1}; {\varvec{\lambda }}) \end{aligned}$$

and

$$\begin{aligned} Pr(\mathbf{z} | {\varvec{\lambda }}) \, = \, \prod _{t = 1}^T \, Pr(\mathbf{z}_t | \mathbf{z}_{t - 1}; {\varvec{\lambda }}). \end{aligned}$$

Then the likelihood function can be written as

$$\begin{aligned} L({\varvec{\lambda }} | \mathbf{Y}_T) \, = \, \int \, f(\mathbf{Y}_T, \mathbf{z} | {\varvec{\lambda }})\, d\mathbf{z} \, = \, \int \, f(\mathbf{Y}_T | \mathbf{z}; {\varvec{\lambda }})\, Pr(\mathbf{z} | {\varvec{\lambda }}) \, d\mathbf{z} \end{aligned}$$

where the integration denotes summation over all possible values of \(\mathbf{z} \, = \, \mathbf{z}_T \otimes \mathbf{z}_{T - 1} \otimes \cdots \otimes \mathbf{z}_1\). Here \(\otimes \) is the usual Kronecker product. Differentiating the log-likelihood function \(\mathcal{L}({\varvec{\lambda }} | \mathbf{Y}_T) \, = \, {\text {ln}} \, L({\varvec{\lambda }} | \mathbf{Y}_T)\) with respect to \({\varvec{\lambda }}\) leads to the score function \(S({\varvec{\lambda }})\).

We have

$$\begin{aligned} \begin{aligned} \frac{\partial \, {{{\mathcal {L}}}}({\varvec{\lambda }} | \mathbf{Y}_T)}{\partial \, {\varvec{\lambda }}}&\, = \, \frac{\partial \, {\text {ln}} \, L({\varvec{\lambda }} | \mathbf{Y}_T)}{\partial \, {\varvec{\lambda }}} \\&\, = \, \frac{1}{L({\varvec{\lambda }} | \mathbf{Y}_T)} \, \int \, \frac{\partial \, f(\mathbf{Y}_T | \mathbf{z}; {\varvec{\lambda }})}{\partial \, {\varvec{\lambda }}} \, Pr (\mathbf{z} | {\varvec{\lambda }}) \, d\mathbf{z} \\&\, = \, \frac{1}{L({\varvec{\lambda }} | \mathbf{Y}_T)} \, \int \, \frac{\partial \, {\text {ln}} \, f(\mathbf{Y}_T | \mathbf{z}; {\varvec{\lambda }})}{\partial \, {\varvec{\lambda }}} \, f(\mathbf{Y}_T | \mathbf{z}; {\varvec{\lambda }}) \, Pr (\mathbf{z} | {\varvec{\lambda }}) \, d\mathbf{z} \\&\, = \, \int \, \frac{\partial \, {\text {ln}} \, f(\mathbf{Y}_T | \mathbf{z}; {\varvec{\lambda }})}{\partial \, {\varvec{\lambda }}} \, Pr(\mathbf{z} | \mathbf{Y}_T ; {\varvec{\lambda }}) \, d\mathbf{z} \end{aligned} \end{aligned}$$

as

$$\begin{aligned} \frac{f(\mathbf{Y}_T | \mathbf{z}; {\varvec{\lambda }}) \, Pr (\mathbf{z} | {\varvec{\lambda }})}{L({\varvec{\lambda }} | \mathbf{Y}_T)} \, = \, \frac{f(\mathbf{Y}_T | \mathbf{z}; {\varvec{\lambda }}) \, Pr (\mathbf{z} | {\varvec{\lambda }})}{\int \, f(\mathbf{Y}_T, \mathbf{z} | {\varvec{\lambda }}) \, d\mathbf{z}} \, = \, Pr (\mathbf{z} | \mathbf{Y}_T; {\varvec{\lambda }}) \end{aligned}$$

from the Bayes theorem. In the second line of the above computation, we have used the fact that \(Pr (\mathbf{z} | {\varvec{\lambda }}) \, = \, E(\mathbf{z} | {\varvec{\lambda }})\) is a constant function with respect to \({\varvec{\lambda }}\), hence its derivative vanishes.

Thus the score function has the expression

$$\begin{aligned} S({\varvec{\lambda }}) \, = \, \frac{\partial \, \mathcal{L}({\varvec{\lambda }} | \mathbf{Y}_T)}{\partial \, {\varvec{\lambda }}} \, = \, \sum _{t = 1}^T \, S_t ({\varvec{\lambda }}) \end{aligned}$$

where

$$\begin{aligned} S_t ({\varvec{\lambda }}) \, = \, \frac{\partial \, \ell _t}{\partial \, {\varvec{\lambda }}} \, = \, \sum _{m = 1}^M \, \frac{\partial \, {\text {ln}} \, f(y_t | z_{tm}, \mathbf{Y}_{t - 1}; {\varvec{\lambda }})}{\partial \, {\varvec{\lambda }}} \, Pr(z_{tm} | \mathbf{Y}_t; {\varvec{\lambda }}). \end{aligned}$$

This proves (7) and (8).
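
To fix ideas, the quantities entering (7) and (8) can be computed recursively from the data. The following is a minimal numerical sketch in Python (the function and parameter names are illustrative, not the paper's), assuming an M-component normal mixture GARCH(1, 1) with a Haas-type variance recursion \(h_{t m} = \omega _m + \alpha _m (y_{t - 1} - \mu _m)^2 + \beta _m h_{t - 1, m}\); the paper's general model allows arbitrary orders p and q.

```python
import numpy as np

def mixture_garch_filter(y, pi, mu, omega, alpha, beta):
    """Filtered weights tau[t, m] = Pr(z_{tm} | Y_t) and the log likelihood
    for an M-component normal mixture GARCH(1,1).  All parameter vectors
    have length M; alpha + beta < 1 is assumed componentwise."""
    T, M = len(y), len(pi)
    h = np.empty((T, M))
    tau = np.empty((T, M))
    h[0] = omega / (1.0 - alpha - beta)   # initialize at the component steady state
    loglik = 0.0
    for t in range(T):
        if t > 0:
            h[t] = omega + alpha * (y[t - 1] - mu) ** 2 + beta * h[t - 1]
        # component terms pi_m * phi(y_t; mu_m, h_tm), cf. (20) below
        dens = pi * np.exp(-0.5 * (y[t] - mu) ** 2 / h[t]) / np.sqrt(2.0 * np.pi * h[t])
        loglik += np.log(dens.sum())
        tau[t] = dens / dens.sum()        # Bayes rule, as in the proof above
    return tau, h, loglik
```

The score (8) can then be verified numerically by comparing the analytic expressions derived below with finite differences of loglik.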

Since

$$\begin{aligned} Pr (z_{tm} | \mathbf{Y}_t; {\varvec{\lambda }}) \, = \, E(z_{tm} | \mathbf{Y}_t; {\varvec{\lambda }}) \, = \, \tau _{tm | t}, \end{aligned}$$

we get the following relation

$$\begin{aligned} S_t ({\varvec{\lambda }}_m) \, = \, \frac{\partial \, \ell _t}{\partial \, {\varvec{\lambda }}_m} \, = \, \frac{\partial \, {\text {ln}} \, f(y_t | z_{tm}, \mathbf{Y}_{t - 1}; {\varvec{\lambda }})}{\partial \, {\varvec{\lambda }}_m} \, \tau _{tm | t} \end{aligned}$$
(19)

where

$$\begin{aligned} {\text {ln}}\, f(y_t | z_{tm}, \mathbf{Y}_{t - 1}; {\varvec{\lambda }}) \, = \, {\text {ln}}\, \pi _m \, - \, \frac{1}{2} \, {\text {ln}}\, h_{tm} \, - \, \frac{1}{2} \, \frac{(y_t \, - \, \mu _m)^2}{h_{tm}}. \end{aligned}$$
(20)

Differentiating (20) with respect to \({\varvec{\theta }}_m\) gives

$$\begin{aligned} \begin{aligned} \frac{\partial \, {\text {ln}} \, f(y_t | z_{tm}, \mathbf{Y}_{t - 1}; {\varvec{\lambda }})}{\partial \, {\varvec{\theta }}_m}&\, = \, - \, \frac{1}{2} \, \frac{1}{h_{tm}} \, \frac{\partial \, h_{tm}}{\partial \, {\varvec{\theta }}_m} \, + \, \frac{1}{2} \, (y_t \, - \, \mu _m)^{2} \, h_{tm}^{- 2} \, \frac{\partial \, h_{tm}}{\partial \, {\varvec{\theta }}_m} \\&\, = \, -\, \frac{1}{2} \, \frac{\partial \,{\text {ln}}\, h_{tm}}{\partial \, {\varvec{\theta }}_m} \, + \, \frac{1}{2} \, \frac{(y_t \, - \, \mu _m)^{2}}{h_{tm}}\, \frac{\partial \, {\text {ln}} \, h_{tm}}{\partial \, {\varvec{\theta }}_m} \\&\, = \, \frac{1}{2} \, \left[ \frac{(y_t \, - \, \mu _m)^{2}}{h_{tm}} \, - \, 1\right] \, \frac{\partial \, {\text {ln}} \, h_{tm}}{\partial \, {\varvec{\theta }}_m}. \end{aligned} \end{aligned}$$
(21)

Substituting (21) into (19) yields formula (6) in Proposition 3.1.

Differentiating (20) with respect to \(\mu _m\) and \(\pi _m\) produces

$$\begin{aligned} \frac{\partial \, {\text {ln}}\, f(y_t | z_{tm}, \mathbf{Y}_{t - 1}; {\varvec{\lambda }})}{\partial \, \mu _m} \, = \, - \, \frac{1}{2} \, \frac{2 (y_t \, - \, \mu _m)}{h_{tm}} \, ( - 1) \, = \, (y_t \, - \, \mu _m)\, h_{tm}^{- 1} \end{aligned}$$
(22)

and

$$\begin{aligned} \frac{\partial \, {\text {ln}}\, f(y_t | z_{tm}, \mathbf{Y}_{t - 1}; {\varvec{\lambda }})}{\partial \, \pi _m} \, = \, \pi _{m}^{- 1}. \end{aligned}$$
(23)

Substituting (22) and (23), respectively, into (19) gives the formulae in (9). This completes the proof of Proposition 3.1.
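
The component scores in (9) translate directly into code. A short sketch, reusing the filtered quantities tau and h from mixture_garch_filter above and ignoring the indirect dependence through them (whose contribution vanishes in conditional expectation, as shown in the proof):

```python
import numpy as np

def score_mu_pi(y, tau, h, pi, mu):
    """Analytic scores from (9), summed over t = 1, ..., T."""
    S_mu = (((y[:, None] - mu) / h) * tau).sum(axis=0)  # sum_t (y_t - mu_m) h_tm^{-1} tau_tm|t
    S_pi = (tau / pi).sum(axis=0)                       # sum_t pi_m^{-1} tau_tm|t
    return S_mu, S_pi
```

The score for \({\varvec{\theta }}_m\) in (6) is obtained in the same way, weighting \(\frac{1}{2}\, [(y_t - \mu _m)^2 / h_{t m} - 1]\, \partial \, {\text {ln}}\, h_{t m} / \partial \, {\varvec{\theta }}_m\) by \(\tau _{t m | t}\).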

Derivation of (10). From (8) and (9) we get

$$\begin{aligned} \frac{\partial \, J({\varvec{\lambda }} | \mathbf{Y}_T)}{\partial \, \pi _m}\, = \, \frac{\partial \, {{{\mathcal {L}}}}({\varvec{\lambda }} | \mathbf{Y}_T)}{\partial \, \pi _m}\, - \, \lambda \, = \, \sum _{t = 1}^T \,\frac{\partial \, \ell _t}{\partial \, \pi _m}\, - \, \lambda \, = \, \sum _{t = 1}^T \, \pi _{m}^{- 1} \, \tau _{t m | t}\, - \, \lambda \, = \, 0 \end{aligned}$$

hence

$$\begin{aligned} \lambda \, = \, \sum _{t = 1}^T \, {{\hat{\pi }}}_{m}^{- 1} \, {\hat{\tau }}_{t m | t}. \end{aligned}$$

Note that the expectation of \(\frac{\partial \, z_{t m}}{\partial \, \pi _m}\), conditional on \(\mathbf{Y}_{t - 1}\), vanishes. Summing up over \(m = 1, \dots , M\) produces

$$\begin{aligned} \lambda \, = \, \lambda \, \left( \sum _{m = 1}^M \, {{\hat{\pi }}}_m \right) \, = \, \sum _{m = 1}^M \, \sum _{t = 1}^T \, {\hat{\tau }}_{t m | t} \, = \, \sum _{t = 1}^T \, \left( \sum _{m = 1}^M \, {{\hat{\tau }}}_{t m | t}\right) \, = \, \sum _{t = 1}^T \, 1 \, = \, T. \end{aligned}$$

Thus relation (10) holds.

Derivation of (11). From (8) and (9) we have

$$\begin{aligned} \frac{\partial \, J({\varvec{\lambda }} | \mathbf{Y}_T)}{\partial \, \mu _m} \, = \, \frac{\partial \, {{{\mathcal {L}}}}({\varvec{\lambda }} | \mathbf{Y}_T)}{\partial \, \mu _m} \, = \, \sum _{t = 1}^T \, S_t (\mu _m) \, = \, \sum _{t = 1}^T \, h_{t m}^{- 1} \, (y_t \, - \, \mu _m)\, {\tau }_{t m | t} \, = \, 0 \end{aligned}$$

hence

$$\begin{aligned} \sum _{t = 1}^T \, {{\hat{h}}}_{t m}^{- 1} \, y_t \, {{\hat{\tau }}}_{t m | t} \, = \, {{\hat{\mu }}}_m \, \sum _{t = 1}^T \, {{\hat{h}}}_{t m}^{- 1} \, {{\hat{\tau }}}_{t m | t}. \end{aligned}$$

Note that the expectation of \(\frac{\partial \, z_{t m}}{\partial \, \mu _m}\), conditional on \(\mathbf{Y}_{t - 1}\), vanishes. Then we have

$$\begin{aligned} {{\hat{\mu }}}_m \, = \left[ \sum _{t = 1}^T \, {{\hat{h}}}_{t m}^{- 1} \, {{\hat{\tau }}}_{t m | t} \right] ^{- 1} \, \left[ \sum _{t = 1}^T \, y_t \, {{\hat{h}}}_{t m}^{- 1} \, {{\hat{\tau }}}_{t m | t} \right] . \end{aligned}$$

But \(y_t = {\mu }_m^{0} \, + \, \epsilon _t\) and \(\epsilon _t \, = \, \sqrt{h_{t m}^{0}}\, \eta _t\), where \({\mu }_m^0\) is the true value of the intercept and \(h_{tm}^0\) denotes the function \(h_{t m}\) evaluated at the true value \({\varvec{\lambda }}_{m}^{0}\). Substituting this relation into the above expression of \({\hat{\mu }}_m \) yields (11). \(\square \)
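
Relations (10) and (11) give closed-form M-step updates, which is what makes the EM iteration cheap. A sketch, with tau and h held fixed from the preceding E-step (in practice one alternates E- and M-steps until convergence):

```python
import numpy as np

def m_step(y, tau, h):
    """Closed-form updates (10)-(11) for the mixing weights and intercepts."""
    pi_new = tau.mean(axis=0)                              # (10): hat pi_m = T^{-1} sum_t tau
    w = tau / h                                            # weights h_tm^{-1} tau_tm|t
    mu_new = (w * y[:, None]).sum(axis=0) / w.sum(axis=0)  # (11): a GLS-type weighted mean
    return pi_new, mu_new
```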

Proof of Proposition 3.2

The consistency of \({\hat{\pi }}_m\) follows immediately from (10).

Using (11) and (12), the consistency of \({{\hat{\mu }}}_m \) follows from the chain of equalities

$$\begin{aligned} \begin{aligned} {\text {plim}}_{T \rightarrow \infty } \, {{\hat{\mu }}}_m \,&= \, {\mu }_m^0 \, + \, \frac{{\text {plim}}_{T \rightarrow \infty } \, T^{- 1} \, \sum _{t = 1}^T \, (y_t \, - \, \mu _m^0) \, {{\hat{h}}}_{t m}^{- 1} \, {{\hat{\tau }}}_{t m | t}}{{\text {plim}}_{T \rightarrow \infty } \, T^{- 1} \, \sum _{t = 1}^T \, {{\hat{h}}}_{t m}^{- 1} \, {{\hat{\tau }}}_{t m | t}}\\&= \, {\mu }_m^0 \, + \, \frac{E_{\infty }(\sqrt{{h}_{t m}^{0}} \, \eta _t \, {{\hat{h}}}_{t m}^{- 1} \, {{\hat{\tau }}}_{t m | t})}{E_{\infty } ({{\hat{h}}}_{t m}^{- 1} \, {{\hat{\tau }}}_{t m | t})}\\&= \, {\mu }_m^0 \, + \, E_{\infty } (\eta _t) \, \frac{E_{\infty }(\sqrt{{h}_{t m}^{0}} \, {{\hat{h}}}_{t m}^{- 1} \, {{\hat{\tau }}}_{t m | t})}{E_{\infty } ({{\hat{h}}}_{t m}^{- 1} \, {{\hat{\tau }}}_{t m | t})}\\&\, = \, {\mu }_m^0 \, + \, E_{\infty } (\eta _t) \, C \, = \, {\mu }_m^0 \end{aligned} \end{aligned}$$

since \(E_{\infty } (\eta _t) = 0\), and \(\eta _t\) is independent of \(s_t\) and \(h_{t m}\). Here C is a finite real constant. It remains to prove the consistency of \({\hat{\varvec{\theta }}}_m\) for all \(m = 1, \dots , M\). Using (6) and the first-order condition

$$\begin{aligned} \frac{\partial \, J({\varvec{\lambda }} | \mathbf{Y}_T)}{\partial \, {\varvec{\theta }}_m} \, = \, \frac{\partial \, \mathcal{L}({\varvec{\lambda }} | \mathbf{Y}_T)}{\partial \, {\varvec{\theta }}_m} \, = \, \sum _{t = 1}^T \, S_t({\varvec{\theta }}_m)\, = \,{{\varvec{0}}} \end{aligned}$$

we have

$$\begin{aligned} \sum _{t = 1}^T \, \left[ \frac{1}{{{\hat{h}}}_{t m}}\, \frac{\partial \, h_{t m}}{\partial \, {\varvec{\theta }}_m} |_{{\varvec{\theta }}_m \, = \, {\hat{\varvec{\theta }}}_m} \, - \, \frac{ (y_t \, - \, \hat{\mu }_m)^{2}}{{{\hat{h}}}_{t m}^{2}}\, \frac{\partial \, h_{t m}}{\partial \, {\varvec{\theta }}_m} |_{{\varvec{\theta }}_m \, = \, {\hat{\varvec{\theta }}}_m} \right] \, {\hat{\tau }}_{t m | t} \, = \, {{\varvec{0}}} \end{aligned}$$

or, equivalently,

$$\begin{aligned} \sum _{t = 1}^T \, \left[ \frac{{{\hat{h}}}_{t m} \, - \, (y_t \, - \, {{\hat{\mu }}}_m)^2}{{{\hat{h}}}_{t m}^{2}}\, \frac{\partial \, h_{t m}}{\partial \, {\varvec{\theta }}_m} |_{{\varvec{\theta }}_m \, = \, {\hat{\varvec{\theta }}}_m}\right] \, {{\hat{\tau }}}_{t m | t} \, = \, {{\varvec{0}}}. \end{aligned}$$

Note that the expectation of \(\frac{\partial \, z_{t m}}{\partial \, {\varvec{\theta }}_m}\), conditional on \(\mathbf{Y}_{t - 1}\), vanishes. Substituting \(y_t = \mu _{m}^0 \, + \, \epsilon _t = \mu _{m}^0 \, + \sqrt{h_{t m}^{0}} \, \eta _t\) into the last equation yields

$$\begin{aligned} \sum _{t = 1}^T \, \left[ \frac{{{\hat{h}}}_{t m} \, - \, ( \mu _{m}^0 \, + \sqrt{h_{t m}^{0}} \, \eta _t \, - \, {{\hat{\mu }}}_m)^2}{{{\hat{h}}}_{t m}}\, \frac{\partial \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_m} |_{{\varvec{\theta }}_m \, = \, {\hat{\varvec{\theta }}}_m}\right] \, {{\hat{\tau }}}_{t m | t} \, = \, {{\varvec{0}}} \end{aligned}$$

hence

$$\begin{aligned} \begin{aligned} \sum _{t = 1}^T \,&\Bigg [\frac{{{\hat{h}}}_{t m} \, - \, h_{t m}^{0}\, \eta _{t}^{2} \, - \, (\mu _{m}^0 \, - \, {{\hat{\mu }}}_m)^2 \, - \, 2 \, \sqrt{h_{t m}^{0}} \, \eta _t \, (\mu _{m}^0 \, - \, {{\hat{\mu }}}_m)}{{{\hat{h}}}_{t m}}\, \Bigg .\\&\quad \times \,\Bigg .\frac{\partial \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_m} |_{{\varvec{\theta }}_m \, = \, {\hat{\varvec{\theta }}}_m}\Bigg ] \, {{\hat{\tau }}}_{t m | t} \, = \, {{\varvec{0}}}. \end{aligned} \end{aligned}$$

Taking a first-order Taylor expansion of \({{\hat{h}}}_{t m}\) around \(h_{t m}^{0}\), i.e.,

$$\begin{aligned} {{\hat{h}}}_{t m} \, = \, h_{t m}^{0} \, + \, \frac{\partial \, h_{t m}}{\partial \, {\varvec{\theta }}_{m}^{'}} |_{{\varvec{\theta }}_m \, = \, {\varvec{\theta }}_{m}^{0}}\, ( {\hat{\varvec{\theta }}}_m \, - \, {\varvec{\theta }}_{m}^{0}) \end{aligned}$$

and substituting into the previous equation yields

$$\begin{aligned} \begin{aligned} \sum _{t = 1}^T \, \{&\left[ h_{t m}^{0} \, + \, \frac{\partial \, h_{t m}}{\partial \, {\varvec{\theta }}_{m}^{'}} |_{{\varvec{\theta }}_m \, = \, {\varvec{\theta }}_{m}^{0}}\, ( {\hat{\varvec{\theta }}}_m \, - \, {\varvec{\theta }}_{m}^{0}) \, - \, h_{t m}^{0}\, \eta _{t}^{2} \, - \, (\mu _{m}^{0} \, - \, {{\hat{\mu }}}_m)^2 \, \right. \\&\left. \quad - \, 2 \, \sqrt{h_{t m}^{0}} \, {\eta }_t \, (\mu _{m}^{0} \, - \, {{\hat{\mu }}}_m)\right] \, \frac{1}{{{\hat{h}}}_{t m}}\, \frac{\partial \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_m} |_{{\varvec{\theta }}_m \, = \, {\hat{\varvec{\theta }}}_m} \} \, {{\hat{\tau }}}_{t m | t} \, = \, {{\varvec{0}}}. \end{aligned} \end{aligned}$$

Multiplying the last equation by \(T^{- 1}\) and taking the limit when T goes to infinity, we have

$$\begin{aligned} \begin{aligned}&{\text {plim}}_{T \rightarrow \infty } \, T^{- 1} \, \left[ \sum _{t = 1}^T \, \frac{\partial \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_m} |_{{\varvec{\theta }}_m \, = \, {\hat{\varvec{\theta }}}_m} \, \frac{\partial \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_{m}^{'}} |_{{\varvec{\theta }}_m \, = \, {\hat{\varvec{\theta }}}_m} \, {{\hat{\tau }}}_{t m | t}\right] \\&\quad \times \, {\text {plim}}_{T \rightarrow \infty } ( {\hat{\varvec{\theta }}}_m \, - \, {\varvec{\theta }}_{m}^{0})\\&\quad \, = \, {{{\mathcal {I}}}}_m \, {\text {plim}}_{T \rightarrow \infty } ( {\hat{\varvec{\theta }}}_m \, - \, {\varvec{\theta }}_{m}^{0})\, = \, {{\varvec{0}}} \end{aligned} \end{aligned}$$

as \(E(\eta _t^2) = 1\) and \( {\text {plim}}_{T \rightarrow \infty } \, {{\hat{\mu }}}_m \, = \, {\mu }_m^0\), for all \(m = 1, \dots , M\). By assumption (13) the \(r \times r\) matrix \({{{\mathcal {I}}}}_m\) is finite and nonsingular. It follows that

$$\begin{aligned} {\text {plim}}_{T \rightarrow \infty } {\hat{\varvec{\theta }}}_m \, = \, {\varvec{\theta }}_{m}^{0}, \end{aligned}$$

which proves the consistency of \({\hat{\varvec{\theta }}}_m\). \(\square \)

Proof of Proposition 3.3

Set \({\varvec{\lambda }}_m^0 = {\varvec{\lambda }}_m\) to simplify notation. The asymptotic variance of \({{\hat{\pi }}}_m\) is given by

$$\begin{aligned} {\text {var}}_{\infty }({{\hat{\pi }}}_m) = E({{\hat{\pi }}}_m^2) \, - \, [E({{\hat{\pi }}}_m)]^2 \, = \, \pi _m \, - \, \pi _m^2 \, = \, \pi _m (1 - \pi _m). \end{aligned}$$

The asymptotic variance of \({{\hat{\mu }}}_m\) is given by

$$\begin{aligned} \begin{aligned} {\text {var}}_{\infty }[\sqrt{T} \, ({{\hat{\mu }}}_m \, - \, \mu _m^0)]&= T \, E[({{\hat{\mu }}}_m \, - \, \mu _m^0)^2]\\&= \, T \, E\{[\sum _{t = 1}^T \, {{\hat{h}}}_{t m}^{- 1}\, {{\hat{\tau }}}_{t m | t}]^{- 2}\, [\sum _{t = 1}^T \, \epsilon _t \, {{\hat{h}}}_{t m}^{- 1}\, {{\hat{\tau }}}_{t m | t}]^{ 2}\}\\&= T \, E\{[T^{- 1} \, \sum _{t = 1}^T \, {{\hat{h}}}_{t m}^{- 1}\, {{\hat{\tau }}}_{t m | t}]^{- 2}\, [T^{- 1} \, \sum _{t = 1}^T \, \epsilon _t \, {{\hat{h}}}_{t m}^{- 1}\, {{\hat{\tau }}}_{t m | t}]^{ 2}\}\\&= T \, E\{[T^{- 1} \, \sum _{t = 1}^T \, {{\hat{h}}}_{t m}^{- 1}\, {{\hat{\tau }}}_{t m | t}]^{- 2}\, [T^{- 2} \, \sum _{t = 1}^T \, \epsilon _t^2 \, {{\hat{h}}}_{t m}^{- 2}\, {{\hat{\tau }}}_{t m | t}^{ 2}]\}\\&= E\{[T^{- 1} \, \sum _{t = 1}^T \, {{\hat{h}}}_{t m}^{- 1}\, {{\hat{\tau }}}_{t m | t}]^{- 2}\, [T^{- 1} \, \sum _{t = 1}^T \, {{\hat{h}}}_{t m}\, \eta _t^2 \, {{\hat{h}}}_{t m}^{- 2}\, {{\hat{\tau }}}_{t m | t}^{ 2}]\}\\&= E\{[T^{- 1} \, \sum _{t = 1}^T \, {{\hat{h}}}_{t m}^{- 1}\, {{\hat{\tau }}}_{t m | t}]^{- 2}\, [T^{- 1} \, \sum _{t = 1}^T \, {{\hat{h}}}_{t m}^{- 1}\, {{\hat{\tau }}}_{t m | t}^{ 2}]\}\\&= E[T^{- 1} \, \sum _{t = 1}^T \, {{\hat{h}}}_{t m}^{- 1}\, {\hat{\tau }}_{t m | t}]^{- 1} \, = \, \gamma _m^{- 1} \end{aligned} \end{aligned}$$

as \(E(\eta _t^2) = 1\), \(E({{\hat{\tau }}}_{t m | t}^{ 2}) = E({\hat{\tau }}_{t m | t})\), and the cross products \(\epsilon _t \, \epsilon _s\) for \(t \ne s\) have zero expectation, \(\epsilon _t\) being a martingale difference. To compute the asymptotic variance of \({\hat{\varvec{\theta }}}_{m}\) we set

$$\begin{aligned} {{{\mathcal {I}}}}_{m T} \, = \, T^{- 1}\, \sum _{t = 1}^T \, \frac{\partial \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_m}|_{ {\varvec{\theta }}_m = {\hat{\varvec{\theta }}}_{m}}\, \frac{\partial \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_{m}^{'}}|_{ {\varvec{\theta }}_m = {\hat{\varvec{\theta }}}_{m}} \, {{\hat{\tau }}}_{t m | t} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} {J}_{m T}&= \sum _{t = 1}^T \, (\eta _t^2 \, - \, 1)\, \frac{h_{t m}^0}{{{\hat{h}}}_{t m}} \, \frac{\partial \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_m}|_{ {\varvec{\theta }}_m = {\hat{\varvec{\theta }}}_{m}}\, {{\hat{\tau }}}_{t m | t}\\&\quad + \, (\mu _m^0 \, - \, {{\hat{\mu }}}_m)^2 \, \sum _{t = 1}^T \, \frac{1}{{{\hat{h}}}_{t m}} \, \frac{\partial \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_m}|_{ {\varvec{\theta }}_m = {\hat{\varvec{\theta }}}_{m}}\, {{\hat{\tau }}}_{t m | t}\\&\quad + \, 2\, (\mu _m^0 \, - \, {{\hat{\mu }}}_m)\, \sum _{t = 1}^T \, \frac{\sqrt{h_{t m}^0} \, \eta _t}{{{\hat{h}}}_{t m}} \, \frac{\partial \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_m}|_{ {\varvec{\theta }}_m = {\hat{\varvec{\theta }}}_{m}}\, {{\hat{\tau }}}_{t m | t}. \end{aligned} \end{aligned}$$

Then we have

$$\begin{aligned} {\text {plim}}_{T \rightarrow \infty } \, {{{\mathcal {I}}}}_{m T} \, = \, \mathcal{I}_m \qquad {\text {plim}}_{T \rightarrow \infty } \, T^{- 1} \, {J}_{m T} \, {J}_{m T}^{'}\, = \, 2 \, {{{\mathcal {I}}}}_m. \end{aligned}$$

It follows that

$$\begin{aligned} \begin{aligned} {\text {var}}_{\infty } [\sqrt{T} \, ({\hat{\varvec{\theta }}}_m \, - \, {\varvec{\theta }}_m^0)]&= T E\left[ ({\hat{\varvec{\theta }}}_m \, - \, {\varvec{\theta }}_m^0)\, ({\hat{\varvec{\theta }}}_m \, - \, {\varvec{\theta }}_m^0)^{'}\right] \\&= E\left[ {{{\mathcal {I}}}}_{m T}^{- 1}\, T^{- 1} \, J_{m T}\, J_{m T}^{'} \, {{{\mathcal {I}}}}_{m T}^{- 1}\right] \\&= \, 2 \, {{{\mathcal {I}}}}_{m}^{- 1} \, {{{\mathcal {I}}}}_m \, {{{\mathcal {I}}}}_{m}^{- 1} \, = \, 2 \, {{{\mathcal {I}}}}_{m}^{- 1}. \end{aligned} \end{aligned}$$
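
These three variances translate into standard errors as follows. A sketch, reading the first formula as the variance of \(\sqrt{T}({{\hat{\pi }}}_m - \pi _m)\) and taking the information matrices \({{{\mathcal {I}}}}_m\) as given (they can be assembled from (14), see the sketch after the derivations below):

```python
import numpy as np

def standard_errors(tau, h, pi_hat, I_list):
    """Standard errors implied by Proposition 3.3.  I_list holds the r x r
    information matrices I_m for m = 1, ..., M."""
    T = tau.shape[0]
    se_pi = np.sqrt(pi_hat * (1.0 - pi_hat) / T)       # from pi_m (1 - pi_m)
    gamma = (tau / h).mean(axis=0)                     # gamma_m = T^{-1} sum_t h^{-1} tau
    se_mu = np.sqrt(1.0 / (gamma * T))                 # from gamma_m^{-1}
    se_theta = [np.sqrt(np.diag(2.0 * np.linalg.inv(I)) / T) for I in I_list]  # from 2 I_m^{-1}
    return se_pi, se_mu, se_theta
```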

Derivation of (14). The derivative \(\frac{\partial \, S_t ({\varvec{\theta }}_m)}{\partial \, {\varvec{\theta }}_{m}^{'}}\) is given by

$$\begin{aligned} \begin{aligned} \frac{\partial \, S_t ({\varvec{\theta }}_m)}{\partial \, {\varvec{\theta }}_{m}^{'}} \,&= \, \frac{1}{2} \, \left[ - \, \epsilon _t^2 \, h_{t m}^{- 2}\, \frac{\partial \, h_{t m}}{\partial \, {\varvec{\theta }}_m}\right] \, \frac{\partial \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_{m}^{'}}\, {\tau }_{t m | t}\\&\quad + \, \frac{1}{2} \, \left[ \frac{\epsilon _t^2}{h_{t m}}\, - \, 1\right] \, \frac{\partial ^2 \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_m \, \partial \, {\varvec{\theta }}_{m}^{'}}\, {\tau }_{t m | t}\\&\quad + \, \frac{1}{2} \, \left[ \frac{\epsilon _t^2}{h_{t m}}\, - \, 1\right] \, \frac{\partial \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_m}\, \frac{\partial \, \tau _{t m | t}}{ \partial \, {\varvec{\theta }}_{m}^{'}}. \end{aligned} \end{aligned}$$

Now we can approximate \( \frac{\partial ^2 \, {{{\mathcal {L}}}}({\varvec{\lambda }})}{\partial \, {\varvec{\theta }}_m \, \partial \, {\varvec{\theta }}_{m}^{'}}\) by taking the expectation of \(\frac{\partial \, S_t ({\varvec{\theta }}_m)}{\partial \, {\varvec{\theta }}_{m}^{'}}\) conditional on \(\mathbf{Y}_{t - 1}\). Recall that, conditional on \(\mathbf{Y}_{t - 1}\), the quantities \(h_{t m}\), \(\tau _{t m | t}\), and \( \frac{\partial \, \tau _{t m | t}}{ \partial \, {\varvec{\theta }}_{m}^{'}}\) are nonstochastic. Furthermore, we have

$$\begin{aligned} E(\epsilon _t^2 | \mathbf{Y}_{t - 1}) \, = \, E(h_{t m}\, \eta _t^2 | \mathbf{Y}_{t - 1}) \, = \, E(h_{t m} | \mathbf{Y}_{t - 1}) \, = \, h_{t m}. \end{aligned}$$

Then the conditional expectations of the second and third summands in \(\frac{\partial \, S_t ({\varvec{\theta }}_m)}{\partial \, {\varvec{\theta }}_{m}^{'}}\) vanish, and the conditional expectation of the first summand reduces to

$$\begin{aligned} \begin{aligned} E\left[ \frac{\partial \, S_t ({\varvec{\theta }}_m)}{\partial \, {\varvec{\theta }}_{m}^{'}} | \mathbf{Y}_{t - 1}\right]&= - \frac{1}{2} \, E\left[ \frac{\epsilon _t^2}{h_{t m}} \, \frac{\partial \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_m} \, \frac{\partial \,{\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_{m}^{'}} \, { \tau }_{t m | t} | \mathbf{Y}_{t - 1}\right] \\&= \, - \, \frac{1}{2} \, E\left( \frac{\partial \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_m} \, \frac{\partial \,{\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_{m}^{'}} \, { \tau }_{t m | t} | \mathbf{Y}_{t - 1}\right) \\&\approx \, - \, \frac{1}{2 T} \, \sum _{t = 1}^T \, \frac{\partial \, {\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_m} \, \frac{\partial \,{\text {ln}} \, h_{t m}}{\partial \, {\varvec{\theta }}_{m}^{'}} \, { \tau }_{t m | t}. \end{aligned} \end{aligned}$$

Derivation of (15). The derivative \(\frac{\partial \, S_t ({\pi }_m)}{\partial \, {\pi }_{m}}\) is given by

$$\begin{aligned} \begin{aligned} \frac{\partial \, S_t ({\pi }_m)}{\partial \, {\pi }_{m}}&= \, - \, \pi _{m}^{- 2} \, \tau _{t m | t} \, + \, \pi _{m}^{- 1} \, \frac{\partial \, \tau _{t m | t}}{\partial \, \pi _m} \\&= - \, \pi _{m}^{- 2} \, \tau _{t m | t} \, + \, \pi _{m}^{- 1} \, [\pi _{m}^{- 1}\, \tau _{t m | t} \, - \, \pi _{m}^{- 1}\, \tau _{t m | t}^{2}] \, = \, \, - \, \pi _{m}^{- 2} \, \tau _{t m | t}^2. \end{aligned} \end{aligned}$$

Hence it follows

$$\begin{aligned} - \, \frac{\partial ^2 \, {{{\mathcal {L}}}}({\varvec{\lambda }} | \mathbf{Y}_T)}{\partial \, \pi _m^{2}} \approx E\left[ - \, \frac{\partial ^2 \, {{{\mathcal {L}}}}({\varvec{\lambda }} | \mathbf{Y}_T)}{\partial \, \pi _m^{2}} | \mathbf{Y}_{t - 1}\right] \approx \, \pi _{m}^{- 2} \, \frac{1}{T} \, \sum _{t = 1}^T \, \tau _{t m | t}^2. \end{aligned}$$

Derivation of (16). The derivative \(\frac{\partial \, S_t ({\mu }_m)}{\partial \, {\mu }_{m}}\) is given by

$$\begin{aligned} \frac{\partial \, S_t ({\mu }_m)}{\partial \, {\mu }_{m}} = \frac{\partial \,}{\partial \, \mu _m} \left[ h_{t m}^{- 1} \, (y_t \, - \, \mu _m) \, \tau _{t m | t}\right] \approx - \, h_{t m}^{- 1}\, \tau _{t m | t}, \end{aligned}$$

hence

$$\begin{aligned} - \, \frac{\partial ^2 \, {{{\mathcal {L}}}}({\varvec{\lambda }} | \mathbf{Y}_T)}{\partial \, \mu _m^{2}} \approx E\left[ - \, \frac{\partial \, S_t ({\mu }_m)}{\partial \, {\mu }_{m}} | \mathbf{Y}_{t - 1}\right] \approx \, \frac{1}{T} \sum _{t = 1}^T\, h_{t m}^{- 1}\, \tau _{t m | t}. \end{aligned}$$
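
The sample counterparts of (14)-(16) are straightforward to assemble. A sketch, where dlogh[t, m, :] holds the model-specific derivative \(\partial \, {\text {ln}}\, h_{t m} / \partial \, {\varvec{\theta }}_m\) at the estimates (assumed precomputed); note that (14) equals one half of \({{{\mathcal {I}}}}_{m T}\) from the proof of Proposition 3.3, consistent with the asymptotic variance \(2 \, {{{\mathcal {I}}}}_{m}^{- 1}\):

```python
import numpy as np

def information_blocks(tau, h, pi_hat, dlogh):
    """Sample versions of the expected-information approximations (14)-(16)."""
    T, M = tau.shape
    I_theta = [np.einsum('ti,tj,t->ij', dlogh[:, m], dlogh[:, m], tau[:, m]) / (2.0 * T)
               for m in range(M)]                     # (14)
    I_pi = (tau ** 2).mean(axis=0) / pi_hat ** 2      # (15)
    I_mu = (tau / h).mean(axis=0)                     # (16)
    return I_theta, I_pi, I_mu
```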

Derivation of (18). The last term of

$$\begin{aligned} {{{\mathcal {L}}}}({\hat{\varvec{\lambda }}} | \mathbf{Y}_T) \, = \, \sum _{t = 1}^T \, \sum _{m = 1}^M \, \left[ {\text {ln}}\, {{\hat{\pi }}}_m \, - \, \frac{1}{2} \, {\text {ln}}\, {{\hat{h}}}_{t m} \, - \, \frac{1}{2} \, \frac{{{\hat{\epsilon }}}_t^2}{{{\hat{h}}}_{t m}}\right] \, {{\hat{\tau }}}_{t m | t} \end{aligned}$$

can be approximated by T/2 for T sufficiently large. In fact, we have

$$\begin{aligned} \begin{aligned} \frac{1}{2} \, \sum _{t = 1}^T \, \sum _{m = 1}^M \, \frac{{{\hat{\epsilon }}}_t^2}{{{\hat{h}}}_{t m}} \, {{\hat{\tau }}}_{t m | t}&= \frac{1}{2} \, \sum _{t = 1}^T \, \sum _{m = 1}^M \, \eta _t^2 \, {{\hat{\tau }}}_{t m | t} \\&= \frac{1}{2} \, \sum _{t = 1}^T \, \eta _t^2 \, \sum _{m = 1}^M \, {{\hat{\tau }}}_{t m | t}\\&\, = \, \frac{1}{2} \, \sum _{t = 1}^T \, \eta _t^2 \\&\, = \, \frac{1}{2} \, T \, \frac{1}{T} \, \sum _{t = 1}^T \, \eta _t^2 \end{aligned} \end{aligned}$$

as \({{\hat{\epsilon }}}_t^2 \, = \, {{\hat{h}}}_{t m} \, \eta _t^2\) and \(\sum _{m = 1}^M \, {{\hat{\tau }}}_{t m | t} = 1\). For T large the term \(T^{- 1} \, \sum _{t = 1}^T \, \eta _t^2\) converges to \(E(\eta _t^2) = 1\), which gives the stated approximation by T/2. By using (10), we get

$$\begin{aligned} \begin{aligned} {{{\mathcal {L}}}}({\hat{\varvec{\lambda }}} | \mathbf{Y}_T) \,&= \, \sum _{t = 1}^T \, \sum _{m = 1}^M \, {{\hat{\tau }}}_{t m | t}\, {\text {ln}}\, {{\hat{\pi }}}_m \, - \, \frac{1}{2} \, \sum _{t = 1}^T \, \sum _{m = 1}^M \, {{\hat{\tau }}}_{t m | t}\, {\text {ln}} \, {{\hat{h}}}_{t m} \, - \, \frac{T}{2} \\&= \sum _{m = 1}^M \, \left[ \sum _{t = 1}^T {{\hat{\tau }}}_{t m | t}\right] \, {\text {ln}}\, {{\hat{\pi }}}_m \, - \, \frac{1}{2}\, \sum _{t = 1}^T \sum _{m = 1}^M {{\hat{\tau }}}_{t m | t} \,{\text {ln}} \, {{\hat{h}}}_{t m} \, - \, \frac{T}{2} \\&= T \, \sum _{m = 1}^M \left[ \frac{1}{T} \, \sum _{t = 1}^T {{\hat{\tau }}}_{t m | t}\right] \, {\text {ln}} \, {{\hat{\pi }}}_m \, - \, \frac{1}{2}\, \sum _{t = 1}^T \, \sum _{m = 1}^M\, {{\hat{\tau }}}_{t m | t}\,{\text {ln}} \, {{\hat{h}}}_{t m} \, - \, \frac{T}{2} \\&= T \sum _{m = 1}^M \, {{\hat{\pi }}}_m\, {\text {ln}} \, {\hat{\pi }}_m \, - \, \frac{1}{2}\, \sum _{t = 1}^T \, \sum _{m = 1}^M\, {{\hat{\tau }}}_{t m | t}\,{\text {ln}} \, {{\hat{h}}}_{t m} \, - \, \frac{T}{2} \end{aligned} \end{aligned}$$

for T sufficiently large. Then formula (18) follows.
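
Formula (18) makes model selection cheap, since the approximate maximized log likelihood can be evaluated from the filtered quantities alone. A sketch, paired with a standard BIC penalty (the penalty choice here is illustrative; the paper's selection criteria are developed in the main text):

```python
import numpy as np

def loglik_large_T(tau, h, pi_hat):
    """Large-T approximation (18) to the maximized log likelihood."""
    T = tau.shape[0]
    return (T * np.sum(pi_hat * np.log(pi_hat))
            - 0.5 * np.sum(tau * np.log(h))
            - T / 2.0)

def bic(tau, h, pi_hat, n_params):
    """Standard BIC based on (18); n_params is the number of free parameters."""
    return -2.0 * loglik_large_T(tau, h, pi_hat) + n_params * np.log(tau.shape[0])
```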

Representing an MGARCH as an MS ARMA model. An MGARCH\((M, p, q)\) model can be represented by an M-state Markov switching ARMA\((r, p)\) process, where \(r = \max (p, q)\). Set \(\epsilon _t^2 = h_{t, s_t} + v_t\). Then we have

$$\begin{aligned} \begin{aligned} \epsilon _t^2&\, = \, h_{t, s_t} \, + \, v_t = \omega _{s_t} \, + \, \sum _{i = 1}^q\, \alpha _{i, s_t}\, \epsilon _{t - i}^{2} \, + \, \sum _{j = 1}^p\, \beta _{j, s_t}\, h_{t - j, s_t} \, + \, v_t \\&= \omega _{s_t} \, + \, \sum _{i = 1}^q\, \alpha _{i, s_t}\, \epsilon _{t - i}^{2} \, + \, \sum _{j = 1}^p\, \beta _{j, s_t}\, (\epsilon ^{2}_{t - j} \, - \, v_{t - j}) \, + \, v_t \\&= \omega _{s_t} \, + \, \sum _{k = 1}^r\, \delta _{k, s_t}\, \epsilon _{t - k}^{2} \, + \, v_t \, - \, \sum _{j = 1}^p\, \beta _{j, s_t}\, v_{t - j} \end{aligned} \end{aligned}$$

where \(\delta _{k, s_t} = \alpha _{k, s_t} \, + \, \beta _{k, s_t}\). Here we set \(\alpha _{k, s_t} = 0\) for \(k > q\) and \(\beta _{k, s_t} = 0\) for \(k > p\). For t sufficiently large, the disturbance process \(v_t\) is approximately normally distributed with zero mean and variance

$$\begin{aligned} \begin{aligned} E(v_{t}^2)&\, = \, E[(\epsilon _t^2 \, - \, h_{t, s_t})^2] \, = \, E[(h_{t, s_t} \, \eta _t^2 \, - \, h_{t, s_t})^2] \\&\, = \, E[h_{t, s_t}^{2} \, (\eta _t^2 \, - \, 1)^2] \, = \, E(h_{t, s_t}^{2})\, E[(\eta _t^2 \, - \, 1)^2]\\&\, = \, 2\, E(h_{t, s_t}^{2}) \end{aligned} \end{aligned}$$

as \(E[(\eta _t^2 \, - \, 1)^2] = E(\eta _t^4) \, - \, 2\, E(\eta _t^2) \, + \, 1 \, = \, 2\) for \(\eta _t \sim N(0, 1)\). The explicit computation of unconditional moments of the powers of \(\epsilon _t^2\) and \(h_{t, s_t}\) can be found in Francq and Zakoïan (2008). \(\square \)
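
For illustration, when \(p = q = 1\) the representation above reduces to

$$\begin{aligned} \epsilon _t^2 \, = \, \omega _{s_t} \, + \, (\alpha _{1, s_t} \, + \, \beta _{1, s_t})\, \epsilon _{t - 1}^{2} \, + \, v_t \, - \, \beta _{1, s_t}\, v_{t - 1}, \end{aligned}$$

that is, an M-state Markov switching ARMA(1, 1) in \(\epsilon _t^2\) with AR coefficient \(\delta _{1, s_t} = \alpha _{1, s_t} + \beta _{1, s_t}\) and MA coefficient \(- \beta _{1, s_t}\).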


Cite this article

Cavicchioli, M. Statistical inference for mixture GARCH models with financial application. Comput Stat 36, 2615–2642 (2021). https://doi.org/10.1007/s00180-021-01092-5
