A Bayesian analysis of moving average processes with time-varying parameters

https://doi.org/10.1016/j.csda.2007.04.001

Abstract

A new Bayesian method is proposed for estimation and forecasting with Gaussian moving average (MA) processes with time-varying parameters. The focus is placed on MA models of order one, but a general result is given for an MA process of arbitrary known order. A multiplicative model for the evolution of the squares of the parameters is introduced, following Bayesian conjugacy through beta and truncated gamma distributions and a discount factor. Two new distributions are proposed, providing the prior and posterior distributions of the parameters of the model and the one-step forecast distribution of the process. Several well-known distributional results are extended by replacing the gamma distribution with the truncated gamma distribution. The proposed methodology is illustrated with two examples, consisting of simulated data and of aluminium spot prices from the London Metal Exchange.

Introduction

In recent years a large amount of work has been devoted to time series analysis, with the focus placed on stationary auto-regressive and moving average processes (Box et al., 1994; Chatfield, 1996). From a Bayesian standpoint, Shaarawy (1984), DeJong and Whiteman (1993), and Barnett et al. (1997) develop estimation procedures for moving average processes with time-invariant parameters; their work is based on iterative estimation and, in particular, on Monte Carlo simulation. With the development of state space methods in time series (Harvey, 1989; West and Harrison, 1997; Durbin and Koopman, 2001), the need for time-varying parameters has arisen, because such parameters can adapt as new information is received. For example, it is widely recognized that the volatility of an asset changes over time, an observation that has motivated a wide range of work in financial time series (Tsay, 2002). Although state space models and, in particular, auto-regressive models with time-varying parameters (TVAR) have been discussed in the literature (West and Harrison, 1997, Section 9.6; Foschi et al., 2003; Koopman and Ooms, 2006), the development of time-varying moving average (TVMA) models is not so well documented.

TVAR models have been developed in Kitagawa and Gersch (1996), Dahlhaus (1997), West et al. (1999a, 1999b), Prado et al. (2001), Prado and Huerta (2002), Bibi and Francq (2003), Francq and Gautier (2004), and Anderson and Meerschaert (2005). While West et al. (1999a, 1999b), Prado et al. (2001), and Prado and Huerta (2002) use a state space representation of the TVAR model (see also Section 2) to obtain Bayesian estimators, Kitagawa and Gersch (1996), Dahlhaus (1997), Francq and Gautier (2004), and Anderson and Meerschaert (2005) develop asymptotic theory for TVAR models and thus obtain estimators with desirable asymptotic properties (e.g. consistency or efficiency). This work is based on the assumption of local stationarity (Dahlhaus, 1997; Nason et al., 2000; Francq and Zakoïan, 2001; Mercurio and Spokoiny, 2004): there are several time intervals, called regimes, in each of which the process is assumed to exhibit weak stationarity. For a formal definition of local stationarity the reader is referred to Dahlhaus (1997). Local stationarity is also known as periodical stationarity when the above regimes all have the same length (Anderson and Meerschaert, 2005); when the regimes have different lengths, the term non-periodical stationarity is used and changes in the time series dynamics are said to occur at irregular intervals of time (Francq and Gautier, 2004). In either case, parameter estimation is based on asymptotic theory; but for short-term forecasting and, in particular, for time series data of short length, the asymptotic estimators might not be suitable, because their properties rely on large samples.

Although in theory a TVMA process may be written as an infinite TVAR process, in practice this approach does not work very well, for reasons relating to the order of the TVAR process and to how implementable a high-order TVAR process may be. Especially when a low-order TVMA is required, it may be inefficient to fit, say, a TVAR(100) model. For this reason we deem an analysis of the TVMA of order one important, and in this paper we concentrate on this model. It should be mentioned that a TVMA model can be written in state space form, but if the modeller wishes to allow the time-varying parameters to admit a distribution, this necessarily defines a state space model with stochastic design components. We believe that estimation of such a model is not easy and would need to resort to Monte Carlo or other simulation methodology. In Section 2 we comment on this state space representation and its difficulties in estimation.

This paper develops a new Bayesian algorithm for estimation and forecasting with moving average processes with time-varying parameters. The focus is placed on TVMA models of order one, but we outline the development of more general moving average processes. The basic idea is to construct a full distributional model for the response variable conditional on the parameters, together with an evolution model for the squares of the parameters. This is implemented as if the squares of the parameters followed a stochastic variance law, which is well specified via beta and truncated gamma distributions and a discount factor. In updating we then obtain a conjugate model, for which estimation and forecasting are developed by generalizing several distributional results of the normal/gamma and gamma/beta conjugacy. The posterior distribution of the TVMA parameters is an interesting new distribution, for which we discuss some properties. Although we deal exclusively with time-varying parameters, the proposed methodology can be applied to ordinary moving average processes simply by equating the moving average parameters over time; this is done routinely by setting the discount factor, which measures the dispersion of the parameters, equal to one. Our proposal of a conjugate analysis for TVMA processes aims to introduce a new Bayesian algorithm which is fast and does not rely on approximations, asymptotic theory, or simulation.

The paper is organized as follows. Section 2 gives a state space representation of TVAR and TVMA models and outlines some of the difficulties in the estimation of TVMA processes. Section 3 develops the main Bayesian algorithm for TVMA time series of order one. Section 4 introduces a symmetric distribution, which provides the square roots of the inverted gamma and truncated inverted gamma distributions, and discusses some of the properties of this new distribution. In Section 5 the posterior distribution of the TVMA parameters is derived. The predictive distribution of the process is discussed in Section 6, and Section 7 discusses how the discount factor is chosen. Section 8 develops the main theory for TVMA processes of any arbitrary, but known, order, and Section 9 illustrates the proposed methodology with two examples consisting of simulated data and aluminium spot price data from the London Metal Exchange. Section 10 summarizes and comments on the main findings and outlines how the process mean can be estimated. Finally, in the appendix, some well-known results related to gamma distributions are extended to their truncated versions and the Bayesian conjugacy between truncated gamma and normal distributions is developed.


State space representation of moving average models

Let $\{X_t\}$ be a sequence of observations, observed at roughly equal intervals of time. The TVAR(p) model is defined by
$$X_t = \phi_{1t}X_{t-1} + \phi_{2t}X_{t-2} + \cdots + \phi_{pt}X_{t-p} + \varepsilon_t,$$
where $t > p$, $\phi_{1t},\dots,\phi_{pt}$ are the $p$ time-varying AR parameters and $\varepsilon_t \sim N(0,\Sigma_t)$. Here the volatility $\Sigma_t$ is assumed known, but its extension to unknown and stochastic $\Sigma_t$ is easy (West and Harrison, 1997). We can now put the above model into state space form by writing $X_t = F_t'\theta_t + \varepsilon_t$, where $F_t = [X_{t-1}\ \cdots\ X_{t-p}]'$ and $\theta_t = [\phi_{1t}\ \cdots\ \phi_{pt}]'$. West and Harrison (1997), and
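The state space form above can be simulated directly. The sketch below generates a TVAR(2) path with a small random-walk drift on the parameter vector θ_t; the drift law and all numerical values are illustrative assumptions, not the paper's specification (which evolves parameters through a discounted conjugate model).

```python
# Sketch: simulating a TVAR(p) process in the state space form
# X_t = F_t' theta_t + eps_t, with F_t = [X_{t-1} ... X_{t-p}]'.
# The random-walk drift on theta_t is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(0)
p, T, sigma = 2, 300, 1.0
theta = np.array([0.4, -0.2])          # initial AR parameters, inside the stationary region
x = np.zeros(T + p)                    # the first p entries serve as presample values

for t in range(p, T + p):
    F = x[t - p:t][::-1]               # F_t = [X_{t-1}, ..., X_{t-p}]
    x[t] = F @ theta + rng.normal(0.0, sigma)
    theta = theta + rng.normal(0.0, 0.005, size=p)  # assumed slow parameter drift

x = x[p:]
print(x.shape)  # (300,)
```

The same loop, with the drift variance set to zero, reduces to an ordinary AR(p) simulation, mirroring the paper's remark that constant parameters are a special case.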

Motivation

In order to motivate the development of the TVMA model, we first discuss briefly the case of the moving average process of order 1 (MA(1)), for which the parameter $\alpha$ is time-invariant and the process $\{X_t\}$ is defined by
$$X_t = \xi_t + \alpha\xi_{t-1},\qquad (1)$$
where the $\xi_t$ are i.i.d. innovations, each following the normal distribution $N(0,V)$, for a variance $V$. If $V$ is known, one can simplify model (1) by defining the process $Y_t = \zeta_t + \alpha\zeta_{t-1}$, where $Y_t = X_t/\sqrt{V}$ and $\zeta_t = \xi_t/\sqrt{V}$, so that $\zeta_t \sim N(0,1)$. If $V$ is unknown, which will be the case in many
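The standardization step can be checked numerically: dividing the MA(1) series by √V leaves the moving average structure intact while rescaling the innovations to unit variance. The values of α and V below are illustrative.

```python
# Sketch of the standardization in model (1): with known V, dividing
# X_t = xi_t + alpha * xi_{t-1} by sqrt(V) gives Y_t = zeta_t + alpha * zeta_{t-1}
# with zeta_t ~ N(0, 1). alpha and V are illustrative choices.
import numpy as np

rng = np.random.default_rng(1)
alpha, V, T = 0.6, 4.0, 100_000
xi = rng.normal(0.0, np.sqrt(V), size=T + 1)   # innovations with variance V
x = xi[1:] + alpha * xi[:-1]                   # MA(1) path

y = x / np.sqrt(V)                             # standardized series
zeta = xi / np.sqrt(V)                         # standardized innovations
print(np.allclose(y, zeta[1:] + alpha * zeta[:-1]))  # True
print(round(float(np.var(zeta)), 2))           # close to 1
```

Note that dividing by V itself (rather than √V) would give innovations with variance 1/V, which is why the square root is required.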

The inverted gamma square root distribution

Lemma 1

Let $X$ be a real random variable and let $\Gamma_m(a,b)$ denote the incomplete gamma function with argument $m>0$. Then
$$p(x) = \frac{c^{2a}\,|x|}{\Gamma_{1/d}(a,b)\,(x^2 + c^2 d)^{a+1}}\exp\left\{-\frac{b\,c^2}{x^2 + c^2 d}\right\},$$
where $a, b, d > 0$ and $c \in \mathbb{R}$, is a density function.

Proof

Consider the transformation $Y = c^2/(X^2 + c^2 d)$, so that, for $x \in \mathbb{R}$, it is $y \in (0, 1/d)$. Then
$$\int_{-\infty}^{+\infty} p(x)\,dx = 2\int_0^{+\infty} p(x)\,dx = \frac{1}{\Gamma_{1/d}(a,b)}\int_0^{1/d} y^{a-1}\exp\{-by\}\,dy = 1,$$
since $Y$ has a truncated gamma distribution $G_{1/d}(a,b)$. □

Corollary 1

If $Y = c^2/(X^2 + c^2 d) \sim G_{1/d}(a,b)$, where $a, b, c, d$ are as in Lemma 1, then $X$ follows the distribution of Lemma 1. If in
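As a numerical sanity check of Lemma 1, the density can be integrated over the real line for illustrative parameter values, computing the truncation constant Γ_{1/d}(a,b) = ∫₀^{1/d} y^{a−1}e^{−by} dy from the regularized incomplete gamma function.

```python
# Numerical check that p(x) of Lemma 1 integrates to one.
# a, b, c, d are illustrative values satisfying a, b, d > 0.
import numpy as np
from scipy import integrate, special

a, b, c, d = 2.5, 1.3, 0.8, 0.4

# Gamma_{1/d}(a, b) = int_0^{1/d} y^(a-1) e^(-b y) dy
#                   = b^(-a) * Gamma(a) * P(a, b/d), with P the regularized
#                     lower incomplete gamma function (scipy.special.gammainc)
trunc_gamma = b ** (-a) * special.gamma(a) * special.gammainc(a, b / d)

def p(x):
    s = x * x + c * c * d
    return c ** (2 * a) * abs(x) * np.exp(-b * c * c / s) / (trunc_gamma * s ** (a + 1))

# p is symmetric about zero, so integrate over (0, inf) and double
total = 2.0 * integrate.quad(p, 0.0, np.inf)[0]
print(round(total, 6))  # 1.0
```

This mirrors the proof: the substitution y = c²/(x² + c²d) maps the positive half-line onto (0, 1/d), where the truncated gamma density integrates to one by construction.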

Estimating the parameters αt

In this section we give the prior distribution of $\alpha_t|x^{t-1}$ and the posterior distribution of $\alpha_t|x^t$. First we consider the prior distribution. From Eq. (12) we have that
$$\left.\frac{1}{V_t + \alpha_t^2 A_t}\,\right|\, x^{t-1} \sim G_{1/V_t}\!\left(\frac{\delta n_{t-1}}{2},\ \frac{\delta n_{t-1} S_{t-1}}{2}\right),$$
where $n_{t-1}$ and $S_{t-1}$ are known at time $t-1$.

Let $\hat{A}_t$ be the evaluation of $A_t$ at $\alpha_{t-1} = \hat{\alpha}_{t-1}$, where $\hat{\alpha}_{t-1}$ is either of the two modes of the posterior distribution of $\alpha_{t-1}|x^{t-1}$. Note that $A_t$ is a function of $\alpha_{t-1}^2$, so both modes of $\alpha_{t-1}|x^{t-1}$ return the same value for $\hat{A}_t$. From Corollary 1, if we set
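The truncated gamma relation above suggests a direct way to draw from the prior of α_t: sample Y = 1/(V_t + α_t²Â_t) from the truncated gamma by inverse-CDF and invert for α_t², attaching a random sign since the distribution of α_t is symmetric. The numerical values of V_t, Â_t, n_{t−1}, S_{t−1}, and δ below are illustrative, not from the paper.

```python
# Sketch: drawing from the prior of alpha_t implied by
# 1/(V_t + alpha_t^2 A_t) | x^{t-1} ~ G_{1/V_t}(delta*n/2, delta*n*S/2),
# a gamma distribution truncated to (0, 1/V_t). All numbers are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
Vt, A_hat, n, S, delta = 1.0, 0.8, 20.0, 1.2, 0.95
a, b = delta * n / 2.0, delta * n * S / 2.0     # shape and rate

g = stats.gamma(a, scale=1.0 / b)
u = rng.uniform(0.0, g.cdf(1.0 / Vt), size=10_000)
y = g.ppf(u)                                    # truncated gamma draws on (0, 1/V_t)

# Invert y = 1/(V_t + alpha^2 * A_hat); clip guards floating-point rounding at y ~ 1/V_t
alpha_sq = np.clip(1.0 / y - Vt, 0.0, None) / A_hat
alpha = rng.choice([-1.0, 1.0], size=y.size) * np.sqrt(alpha_sq)
```

Because y < 1/V_t on the truncated support, 1/y − V_t is non-negative, so the inversion is always well defined; the random sign reproduces the two symmetric modes mentioned above.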

One-step ahead forecast distribution

Recall that Eq. (5) provides the one-step forecast distribution of $X_t$ conditional on $\alpha_t$ and $\alpha_{t-1}$, i.e.
$$X_t|\alpha_t,\alpha_{t-1},x^{t-1} \sim N\!\left(\frac{\alpha_t V_{t-1} x_{t-1}}{V_{t-1} + \alpha_{t-1}^2 V_{t-2}},\ V_t + \alpha_t^2 A_t\right).$$
First, we derive the mean and variance of $X_t$ conditional on the information $x^{t-1}$ and on a non-zero estimate $\hat{\alpha}_{t-1} \neq 0$ of $\alpha_{t-1}$. From (22) and (20), using conditional expectations we have
$$E(X_t|\alpha_{t-1}=\hat{\alpha}_{t-1}, x^{t-1}) = E\{E(X_t|\alpha_t, \alpha_{t-1}=\hat{\alpha}_{t-1}, x^{t-1})\,|\,x^{t-1}\} = \frac{V_{t-1}\, x_{t-1}\, E(\alpha_t|x^{t-1})}{V_{t-1} + \hat{\alpha}_{t-1}^2 V_{t-2}} = 0.$$
From (21) and (22) we obtain the forecast variance as Var(X_t | α_{t-1} = α̂

Choice of V0 and δ

The above analysis is based on the specification of V0 and δ. In this section we briefly discuss how this specification can be carried out.

Section 3.1 discusses the rationale for the choice of $V_t$ in Eq. (3). $V_0$ is a prior variance estimate of $\xi_0$ and it can be set either by using historical data or in combination with $\delta$ to achieve a confidence region for the variance $V_t$. For example, large values of $V_0$, together with a high discount factor $\delta$, will delay a rapid decrease of $V_t$. It is possible to
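The interaction of V₀ and δ can be illustrated numerically. The paper's Eq. (3) is not reproduced in this excerpt, so the sketch below assumes the exponential-decay form V_t = δ^t V₀, which is consistent with the "exponential decay" described in the simulated-data example; treat it as an assumption, not the paper's definition.

```python
# Illustration of how V_0 and the discount factor delta interact, under the
# ASSUMED exponential-decay form V_t = delta^t * V_0 (Eq. (3) is not shown
# in this excerpt). A higher delta delays the decrease of V_t.
import numpy as np

V0, T = 10.0, 200
for delta in (0.90, 0.99):
    Vt = V0 * delta ** np.arange(T)
    # number of steps for V_t to drop below half of V_0
    half_life = int(np.ceil(np.log(0.5) / np.log(delta)))
    print(delta, round(float(Vt[-1]), 4), half_life)
```

Under this assumed form, δ = 0.99 halves the variance only after about 69 steps, whereas δ = 0.90 halves it within 7, which is one way to read the remark that large V₀ with high δ delays a rapid decrease of V_t.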

The moving average process of order q

Consider the general moving average process of order $q$, for any $q \geq 1$, defined by
$$X_t = \xi_t + \alpha_{1t}\xi_{t-1} + \cdots + \alpha_{qt}\xi_{t-q} = \xi_t + \sum_{i=1}^{q} \alpha_{it}\xi_{t-i},$$
where a priori the innovations $\xi_t$ are independent, each following the Gaussian distribution $\xi_t \sim N(0, V_t)$, with $V_t$ specified as in Eq. (3).
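A path from this TVMA(q) process can be simulated directly from the defining sum. The coefficient paths α_{it} and the variance sequence V_t below are illustrative assumptions for the sketch; the paper instead evolves the squared coefficients through beta/truncated-gamma steps, which is not reproduced here.

```python
# Sketch: simulating X_t = xi_t + sum_{i=1}^q alpha_it * xi_{t-i}
# with xi_t ~ N(0, V_t). Coefficient paths and V_t are illustrative.
import numpy as np

rng = np.random.default_rng(3)
q, T, V0, delta = 2, 200, 1.0, 0.99
V = V0 * delta ** np.arange(T + q)              # assumed decaying innovation variance
xi = rng.normal(0.0, np.sqrt(V))                # innovations xi_t ~ N(0, V_t)

# Illustrative slowly varying coefficients alpha_{1t}, ..., alpha_{qt}
t_grid = np.arange(T)
alpha = np.stack([0.5 + 0.2 * np.sin(2 * np.pi * (i + 1) * t_grid / T)
                  for i in range(q)])

# xi[t + q] is the time-t innovation; xi[t:t+q][::-1] holds lags 1..q
x = np.array([xi[t + q] + alpha[:, t] @ xi[t:t + q][::-1] for t in range(T)])
print(x.shape)  # (200,)
```

Setting all α_{it} constant in t recovers an ordinary MA(q) simulation, matching the paper's observation that time-invariant parameters are the δ = 1 special case.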

Write $\alpha_t = [\alpha_{1t}\ \alpha_{2t}\ \cdots\ \alpha_{qt}]'$ and $\Xi_t = [\xi_t\ \xi_{t-1}\ \cdots\ \xi_{t-q}]'$, so that $X_t = [1\ \alpha_t']\,\Xi_t$. We denote by $N_p(\cdot,\cdot)$ the $p$-variate Gaussian distribution and, for consistency with the previous sections, we adopt the convention $N_1(\cdot,\cdot) \equiv N(\cdot,\cdot)$. We can see that $\mathrm{Cov}(X_t, \Xi_t|\alpha_t) =$

Simulated data

We have generated a single time series $\{x_t\}_{t=1,\dots,200}$ from the TVMA model (4) with $V_0 = 1$ and $\delta = 0.99$. The data are plotted in Fig. 3, which also shows the one-step forecast means. The discount factor $\delta = 0.99$ is responsible for a change in the variance of the series $\{x_t\}$, which is centred around zero, and this change in the variance is clearly indicated in Fig. 3 by the vertical line plotted at $t = 93$. Although the exponential decay is responsible for a smooth decrease in the variance of the

Concluding comments

This paper considers the problem of estimation and forecasting with moving average processes with time-varying parameters. The focus is placed on processes of order one, but a general algorithm is proposed for moving average processes of any known order.

Model (6) in Section 3 provides the evolution of the parameters $\alpha_t$ from time $t-1$ to $t$. The proposed procedure depends on a discount factor and on the assumption of a beta distribution for $\gamma_t$. The method makes use of truncated gamma

Acknowledgments

We would like to thank Norman Johnson for making useful comments and suggestions on Section 4. We are also grateful to three anonymous referees who offered valuable suggestions, which improved the paper.

References (41)

  • C. Chatfield, The Analysis of Time Series (1996)
  • C.S. Coffey et al., Properties of doubly-truncated gamma variables, Comm. Statist. Theory Methods (2000)
  • R. Dahlhaus, Fitting time series models to nonstationary processes, Ann. Statist. (1997)
  • P. Damien et al., Sampling truncated normal, beta, and gamma densities, J. Comput. Graph. Statist. (2001)
  • D.N. DeJong et al., Estimating moving average parameters: classical pileups and Bayesian posteriors, J. Bus. Econom. Statist. (1993)
  • J. Durbin et al., Time Series Analysis by State Space Methods (2001)
  • R. Fildes, The evaluation of extrapolative forecasting methods, Int. J. Forecasting (1992)
  • D.A. Foschi et al., A comparative study of algorithms for solving seemingly unrelated regression models, Comput. Statist. Data Anal. (2003)
  • C. Francq et al., Large sample properties of parameter least squares estimates for time-varying ARMA models, J. Time Ser. Anal. (2004)
  • A.C. Harvey, Forecasting, Structural Time Series Models and the Kalman Filter (1989)