MCMC algorithms for constrained variance matrices
Introduction
Markov chain Monte Carlo (MCMC) methods have become a familiar tool for estimating complex statistical models since Gelfand and Smith (1990) introduced the use of the Gibbs sampler (Geman and Geman, 1984) into statistical modelling. There have since been many enhancements to the standard Gibbs sampler for sampling efficiently from posteriors that do not have standard forms, for example adaptive rejection sampling (Gilks and Wild, 1992). Gibbs sampling algorithms are in fact a special case of the more general Metropolis and Hastings algorithms (Metropolis et al., 1953; Hastings, 1970). For a review of the Metropolis–Hastings algorithm, with details of several algorithms not covered here, see Chib and Greenberg (1995).
In this paper we consider alternative Metropolis and Hastings methods for updating variance matrices within statistical models. Variance matrices can be thought of as constrained sets of parameters, and Gelfand et al. (1992) give an elegant method for dealing with constrained posterior distributions. This method, however, requires that the constraint truncation points be easy to calculate and that the equivalent unconstrained posterior distribution be easy to simulate from.
Here, we will start by considering models that contain scalar variance parameters, and variance matrices that do not exhibit additional parameter constraints. We discuss several Metropolis–Hastings methods that can be used in place of the standard Gibbs sampling method in these situations. In particular, we describe two Metropolis–Hastings methods that can also be used when the variance parameters exhibit additional parameter constraints. The first method is based on truncated Normal proposals and requires that the truncation points of the constrained distribution be evaluated at each iteration; the second is based on Normal random-walk Metropolis sampling and simply requires a check that each proposed variance matrix is proper (positive definite). All of the Metropolis–Hastings methods introduced in this paper use proposal distributions that can be tuned by adapting methods similar to those introduced in Browne and Draper (2000): the sampler is run for an adapting period prior to burn-in, during which the proposal distributions are adjusted until they produce the acceptance rate desired by the user.
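To make the first of these two methods concrete, the following is a minimal sketch (ours, not the authors' code) of a truncated-Normal MH update for a single positivity-constrained scalar variance. The function name, the placeholder `log_post`, and the proposal standard deviation `s` are all our own assumptions.

```python
import numpy as np
from scipy.stats import norm, truncnorm

def trunc_normal_update(v, s, log_post, rng):
    """One truncated-Normal MH update of a scalar variance v > 0.

    The proposal is Normal(v, s^2) truncated to (0, inf), so the
    truncation point must be evaluated at every iteration and the
    Hastings ratio picks up the two Normal normalising constants.
    """
    a = (0.0 - v) / s                      # standardised lower truncation point
    v_star = truncnorm.rvs(a, np.inf, loc=v, scale=s, random_state=rng)
    # q(v | v*) / q(v* | v) = Phi(v / s) / Phi(v* / s)
    log_hastings = norm.logcdf(v / s) - norm.logcdf(v_star / s)
    if np.log(rng.uniform()) < log_post(v_star) - log_post(v) + log_hastings:
        return v_star, True                # accept the proposed variance
    return v, False                        # reject, keep the current value
```

During the adapting period, `s` would be increased after too many acceptances and decreased after too many rejections until the empirical acceptance rate matches the user's target, in the spirit of Browne and Draper (2000); the random-walk variant for full matrices is sketched under the within-matrix constraints section below.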
The structure of the paper is as follows. In Section 2 we give a brief background to MCMC sampling and the Metropolis–Hastings (MH) algorithm and define the terminology used in the rest of the paper. In Section 3 we consider how to sample scalar variance parameters and variance matrices that have no additional constraints. In Section 4 we consider constraints across variance matrices and give two examples of where such constraints occur in practice. Then in Section 5 we consider constraints within variance matrices. These types of constraints have already been treated in the literature by several authors using more efficient, model-specific MCMC algorithms, so our objective here is to show the generality and the limitations of our simpler methods. We again give two examples where such constraints occur and, in the second of these, highlight some of the limitations of the MH methods. We end with a general discussion and some possible extensions.
Section snippets
MCMC sampling and the Metropolis–Hastings algorithm
MCMC methods have revolutionised the applicability of Bayesian inference to statistical modelling. Let us imagine a modelling scenario in which we have some known data Y and some unknown parameters θ, with a statistical model that links the data and the parameters. In Bayesian inference we are then interested in finding the joint posterior distribution p(θ | Y) of the parameters given the data. The difficulty is that this distribution is typically found via a multi-dimensional integration which is only
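The snippet is truncated at this point, but the acceptance step at the heart of every MH sampler discussed in the paper has the standard textbook form (our notation, not a quotation from the paper):

```latex
% Metropolis--Hastings: accept a proposal \theta^* \sim q(\cdot \mid \theta)
% with probability
\alpha(\theta,\theta^*) \;=\; \min\!\left\{1,\;
  \frac{p(\theta^* \mid Y)\, q(\theta \mid \theta^*)}
       {p(\theta \mid Y)\, q(\theta^* \mid \theta)}\right\},
% which reduces to the Metropolis ratio p(\theta^* \mid Y)/p(\theta \mid Y)
% for a symmetric proposal q, and to acceptance probability 1 for a Gibbs
% step, where q is the full conditional itself.
```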
Alternative methods for unconstrained variances
In many statistical models the set of unknown parameters will contain either scalar variance parameters or perhaps a variance matrix. If a conjugate prior is used for these parameters then Gibbs sampling steps are often used, as the variances then have scaled inverse-χ² or inverse Wishart (IW) posterior distributions, both of which can easily be sampled from. In this section we will consider what MH methods we could use instead and what loss of efficiency will result.
For the purpose of comparison we
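For reference, the conjugate Gibbs update that these MH alternatives are compared against draws the variance matrix directly from its inverse Wishart full conditional. A minimal sketch, assuming k-dimensional random effects u_1, …, u_n ~ N(0, V) and a V ~ IW(ν₀, S₀) prior (the notation and function name are our assumptions):

```python
import numpy as np
from scipy.stats import invwishart

def gibbs_draw_variance(u, nu0, S0, rng=None):
    """Draw a variance matrix from its inverse Wishart full conditional.

    u   : (n, k) array of random effects / residuals with variance V
    nu0 : prior degrees of freedom of the IW(nu0, S0) prior
    S0  : (k, k) prior scale matrix
    """
    n = u.shape[0]
    # Conjugacy: an IW(nu0, S0) prior combined with Normal(0, V) rows of u
    # gives an IW(nu0 + n, S0 + u'u) full conditional for V.
    return invwishart.rvs(df=nu0 + n, scale=S0 + u.T @ u, random_state=rng)
```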
Constraints across variance matrices
We mentioned earlier that constrained variance matrices can be split into two classes. In this section we consider the less explored area of constraints across variance matrices. Although Browne et al. (2002) have studied the scalar variance case, we believe that the case of constraints across variance matrices has not previously been considered using MCMC sampling.
As an illustration, assume we have two variance matrices; these matrices are then constrained by the
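The displayed definition of the two matrices is lost in this snippet, so as a stand-in here is a generic sketch (entirely ours) of how an across-matrix constraint slots into the same single-site step: whatever condition links Ω₁ and Ω₂, the proposal is simply rejected whenever the pair would violate it. The predicate `constraint_holds` is a hypothetical placeholder.

```python
import numpy as np

def constrained_pair_update(V1, V2, i, j, s, log_post, constraint_holds, rng):
    """Random-walk update of element (i, j) of V1 subject to a joint
    constraint with V2; constraint_holds is a user-supplied predicate
    (e.g. positive definiteness of both V1 and V1 - V2)."""
    prop = V1.copy()
    step = rng.normal(0.0, s)
    prop[i, j] += step
    if i != j:
        prop[j, i] += step                   # keep V1 symmetric
    if not constraint_holds(prop, V2):
        return V1, False                     # across-matrix constraint violated
    if np.log(rng.uniform()) < log_post(prop, V2) - log_post(V1, V2):
        return prop, True
    return V1, False
```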
Constraints within variance matrices
Variance matrices are always constrained by the fact that they have to be positive definite. In this section we consider matrices that have additional constraints on the elements of the matrix. This usually occurs in one of two ways: either certain elements of the matrix are fixed constants, as in Example 3 that follows, or some elements are functions of other elements, as in Example 4. All of these models have been discussed by other authors and so we include the
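As an illustration of the first kind of within-matrix constraint, the sketch below (ours; the multivariate probit setting with a unit diagonal is only an assumed example) updates each free off-diagonal element by a Normal random walk and rejects any proposal that is no longer positive definite, leaving the fixed elements untouched:

```python
import numpy as np

def update_free_elements(R, free, s, log_post, rng):
    """One sweep of Normal random-walk Metropolis over the free elements
    of a constrained variance matrix R, e.g. a correlation-style matrix
    whose diagonal is fixed at 1 as in multivariate probit models.

    free : list of (i, j) pairs with i < j that are allowed to move
    s    : proposal standard deviation, tuned during the adapting period
    """
    for i, j in free:
        prop = R.copy()
        step = rng.normal(0.0, s)
        prop[i, j] += step
        prop[j, i] += step                    # preserve symmetry
        try:
            np.linalg.cholesky(prop)          # positive definiteness test
        except np.linalg.LinAlgError:
            continue                          # constraint violated: reject
        if np.log(rng.uniform()) < log_post(prop) - log_post(R):
            R = prop                          # accept this element's move
    return R
```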
Discussion
In this paper we have given several algorithms for updating variance matrices within an MCMC framework. We have considered MH-based steps that can be used as an alternative to the standard Gibbs sampling IW or inverse gamma updates. We have then shown how two of these methods can be used when the variance matrices are subject to additional constraints. We have identified two distinct types of additional constraint: across-variance constraints and within-variance constraints.
The key to these
Conclusions
This paper has considered the problem of finding a general MCMC method that can be used to fit a large family of models containing variance matrices with additional parameter constraints. Our motivation is to find a generic method that is applicable to as many models as possible, not to find the most efficient method for particular groups of models. The paper considers two simple single-site updating Metropolis–Hastings steps that will bolt easily into MCMC algorithms. Some
References (28)
- Browne, W.J., Draper, D., Goldstein, H., Rasbash, J., 2002. A comparison of methods for fitting multilevel models with complex level 1 variation. Comput. Statist. Data Anal.
- Chib, S., 1992. Bayes inference in the tobit censored regression model. J. Econometrics.
- et al., 2000. Bayesian analysis of cross-sectional and clustered data selection models. J. Econometrics.
- Albert, J.H., Chib, S., 1993. Bayesian analysis of binary and polychotomous response data. J. Amer. Statist. Assoc.
- Browne, W.J., 1998. Applying MCMC methods to multilevel models. Ph.D. Dissertation, Department of Mathematical...
- Browne, W.J., 2003. MCMC Estimation in MLwiN. Institute of Education, University of London,...
- Browne, W.J., Draper, D., 2000. Implementation and performance issues in the Bayesian and likelihood fitting of multilevel models. Comput. Statist.
- Chib, S., Greenberg, E., 1995. Understanding the Metropolis–Hastings algorithm. Amer. Statist.
- Chib, S., Greenberg, E., 1998. Analysis of multivariate probit models. Biometrika.
- Draper, D., Cheal, R., 1997. Practical MCMC for assessment and propagation of model uncertainty. Unpublished Technical...
- Gelfand, A.E., Smith, A.F.M., 1990. Sampling based approaches to calculating marginal densities. J. Amer. Statist. Assoc.
- Gelfand, A.E., Hills, S.E., Racine-Poon, A., Smith, A.F.M., 1990. Illustration of Bayesian inference in normal data models using Gibbs sampling. J. Amer. Statist. Assoc.
- Gelfand, A.E., Smith, A.F.M., Lee, T.-M., 1992. Bayesian analysis of constrained parameter and truncated data problems using Gibbs sampling. J. Amer. Statist. Assoc.
- Gelfand, A.E., Sahu, S.K., Carlin, B.P., 1995. Efficient parameterizations for normal linear mixed models. Biometrika.
Cited by (33)
- Stochastic weighted graphs: Flexible model specification and simulation. Social Networks, 2017.
  Citation excerpt: "The Metropolis–Hastings procedure that we propose samples the t+1st network, x(t+1), via a truncated multivariate Gaussian proposal distribution q(·|x(t)) whose mean depends on the previous sample x(t) and whose variance is a fixed constant σ². The truncated Gaussian is a convenient and commonly used proposal distribution for bounded random variables such as those on the [0, 1] interval with which we are working (see, e.g., Browne, 2006; Claeskens et al., 2010; Müller, 2010; Neelon et al., 2014; Franks et al., 2015). The advantage of the truncated Gaussian over the obvious alternative for bounded random variables – the Beta distribution – is that it is straightforward to concentrate the density of the truncated Gaussian around any point within the bounded range."
- Multiple Imputation and its Application. 2023.
- A Bayesian Approach to Estimating Reciprocal Effects with the Bivariate STARTS Model. Multivariate Behavioral Research, 2023.
- A Mixture Response Time Process Model for Aberrant Behaviors and Item Nonresponses. Multivariate Behavioral Research, 2023.
- Substantive model compatible multilevel multiple imputation: A joint modeling approach. Statistics in Medicine, 2022.