MCMC algorithms for constrained variance matrices
Introduction
Markov chain Monte Carlo (MCMC) methods have become a familiar tool for estimating complex statistical models since Gelfand and Smith (1990) introduced the use of the Gibbs sampler (Geman and Geman, 1984) into statistical modelling. There have since been many enhancements to the standard Gibbs sampler for sampling efficiently from posteriors that do not have standard forms, for example adaptive rejection sampling (Gilks and Wild, 1992). Gibbs sampling algorithms are in fact a special case of the more general Metropolis and Hastings algorithms (Metropolis et al., 1953; Hastings, 1970). For a review of the Metropolis–Hastings algorithm, with details of several algorithms not covered here, see Chib and Greenberg (1995).
In this paper we consider alternative Metropolis and Hastings methods for updating variance matrices within statistical models. Variance matrices can be thought of as constrained sets of parameters, and Gelfand et al. (1992) give an elegant method for dealing with constrained posterior distributions. This method, however, requires that the constraint truncation points be easy to calculate and that the equivalent unconstrained posterior distribution be easy to simulate from.
Here, we will start by considering models that contain scalar variance parameters, and variance matrices that do not exhibit additional parameter constraints. We discuss several Metropolis–Hastings methods that can be used in place of the standard Gibbs sampling method in these situations. In particular, we describe two Metropolis–Hastings methods that can also be used when the variance parameters exhibit additional parameter constraints. The first method is based on truncated Normal proposals and requires that the truncation points of the constrained distribution be evaluated at each iteration; the second is based on Normal random-walk Metropolis sampling and simply requires a check that each proposed variance matrix is proper (positive definite). All of the Metropolis–Hastings methods introduced in this paper use proposal distributions that can be tuned by adapting methods similar to those introduced in Browne and Draper (2000): the sampler is run for an adapting period prior to burn-in, during which the proposal distributions are adjusted until they produce the acceptance rate desired by the user.
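To make the first of these two methods concrete, the following is a minimal sketch (ours, not the authors' code) of a truncated-Normal MH update for a single positivity-constrained scalar variance. The function name, the placeholder `log_post`, and the proposal standard deviation `s` are all our own assumptions.

```python
import numpy as np
from scipy.stats import norm, truncnorm

def trunc_normal_update(v, s, log_post, rng):
    """One truncated-Normal MH update of a scalar variance v > 0.

    The proposal is Normal(v, s^2) truncated to (0, inf), so the
    truncation point must be evaluated at every iteration and the
    Hastings ratio picks up the two Normal normalising constants.
    """
    a = (0.0 - v) / s                      # standardised lower truncation point
    v_star = truncnorm.rvs(a, np.inf, loc=v, scale=s, random_state=rng)
    # q(v | v*) / q(v* | v) = Phi(v / s) / Phi(v* / s)
    log_hastings = norm.logcdf(v / s) - norm.logcdf(v_star / s)
    if np.log(rng.uniform()) < log_post(v_star) - log_post(v) + log_hastings:
        return v_star, True                # accept the proposed variance
    return v, False                        # reject, keep the current value
```

During the adapting period, `s` would be increased after too many acceptances and decreased after too many rejections until the empirical acceptance rate matches the user's target, in the spirit of Browne and Draper (2000); the random-walk variant for full matrices is sketched under the within-matrix constraints section below.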
The structure of the paper is as follows. In Section 2 we give a brief background to MCMC sampling and the Metropolis–Hastings (MH) algorithm and define the terminology used in the rest of the paper. In Section 3 we consider how to sample scalar variance parameters and variance matrices that have no additional constraints. In Section 4 we consider constraints across variance matrices and give two examples of where such constraints occur in practice. Then in Section 5 we consider constraints within variance matrices. These types of constraints have already been treated in the literature by several authors using more efficient, model-specific MCMC algorithms, so our objective here is to show the generality and the limitations of our simpler methods. We again give two examples where such constraints occur and, in the second of these, highlight some of the limitations of the MH methods. We end with a general discussion and some possible extensions.
Section snippets
MCMC sampling and the Metropolis–Hastings algorithm
MCMC methods have revolutionised the applicability of Bayesian inference to statistical modelling. Let us imagine a modelling scenario in which we have some known data Y and some unknown parameters θ, with a statistical model that links the data and the parameters. In Bayesian inference we are then interested in finding the joint posterior distribution p(θ | Y) of the parameters given the data. The difficulty is that this distribution is typically found via a multi-dimensional integration which is only
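The snippet is truncated at this point, but the acceptance step at the heart of every MH sampler discussed in the paper has the standard textbook form (our notation, not a quotation from the paper):

```latex
% Metropolis--Hastings: accept a proposal \theta^* \sim q(\cdot \mid \theta)
% with probability
\alpha(\theta,\theta^*) \;=\; \min\!\left\{1,\;
  \frac{p(\theta^* \mid Y)\, q(\theta \mid \theta^*)}
       {p(\theta \mid Y)\, q(\theta^* \mid \theta)}\right\},
% which reduces to the Metropolis ratio p(\theta^* \mid Y)/p(\theta \mid Y)
% for a symmetric proposal q, and to acceptance probability 1 for a Gibbs
% step, where q is the full conditional itself.
```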
Alternative methods for unconstrained variances
In many statistical models the set of unknown parameters will contain either scalar variance parameters or perhaps a variance matrix. If a conjugate prior is used for these parameters then Gibbs sampling steps are often used, as the variances then have scaled inverse-χ² or inverse Wishart (IW) posterior distributions, both of which can easily be sampled from. In this section we will consider what MH methods we could use instead and what loss of efficiency will result.
For the purpose of comparison we
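For reference, the conjugate Gibbs update that these MH alternatives are compared against draws the variance matrix directly from its inverse Wishart full conditional. A minimal sketch, assuming k-dimensional random effects u_1, …, u_n ~ N(0, V) and a V ~ IW(ν₀, S₀) prior (the notation and function name are our assumptions):

```python
import numpy as np
from scipy.stats import invwishart

def gibbs_draw_variance(u, nu0, S0, rng=None):
    """Draw a variance matrix from its inverse Wishart full conditional.

    u   : (n, k) array of random effects / residuals with variance V
    nu0 : prior degrees of freedom of the IW(nu0, S0) prior
    S0  : (k, k) prior scale matrix
    """
    n = u.shape[0]
    # Conjugacy: an IW(nu0, S0) prior combined with Normal(0, V) rows of u
    # gives an IW(nu0 + n, S0 + u'u) full conditional for V.
    return invwishart.rvs(df=nu0 + n, scale=S0 + u.T @ u, random_state=rng)
```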
Constraints across variance matrices
We mentioned earlier that constrained variance matrices can be split into two classes. In this section we consider the less explored area of constraints across variance matrices. Although Browne et al. (2002) have studied the scalar variance case, we believe that the case of constraints across variance matrices has not previously been considered using MCMC sampling.
As an illustration, assume we have two variance matrices; these matrices are then constrained by the
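The displayed definition of the two matrices is lost in this snippet, so as a stand-in here is a generic sketch (entirely ours) of how an across-matrix constraint slots into the same single-site step: whatever condition links Ω₁ and Ω₂, the proposal is simply rejected whenever the pair would violate it. The predicate `constraint_holds` is a hypothetical placeholder.

```python
import numpy as np

def constrained_pair_update(V1, V2, i, j, s, log_post, constraint_holds, rng):
    """Random-walk update of element (i, j) of V1 subject to a joint
    constraint with V2; constraint_holds is a user-supplied predicate
    (e.g. positive definiteness of both V1 and V1 - V2)."""
    prop = V1.copy()
    step = rng.normal(0.0, s)
    prop[i, j] += step
    if i != j:
        prop[j, i] += step                   # keep V1 symmetric
    if not constraint_holds(prop, V2):
        return V1, False                     # across-matrix constraint violated
    if np.log(rng.uniform()) < log_post(prop, V2) - log_post(V1, V2):
        return prop, True
    return V1, False
```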
Constraints within variance matrices
Variance matrices are always constrained by the fact that they have to be positive definite. In this section we consider matrices that have additional constraints on the elements of the matrix. This usually occurs in one of two ways: either certain elements of the matrix are fixed constants, as in Example 3 that follows, or some elements are functions of other elements, as in Example 4. All of these models have been discussed by other authors and so we include the
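As an illustration of the first kind of within-matrix constraint, the sketch below (ours; the multivariate probit setting with a unit diagonal is only an assumed example) updates each free off-diagonal element by a Normal random walk and rejects any proposal that is no longer positive definite, leaving the fixed elements untouched:

```python
import numpy as np

def update_free_elements(R, free, s, log_post, rng):
    """One sweep of Normal random-walk Metropolis over the free elements
    of a constrained variance matrix R, e.g. a correlation-style matrix
    whose diagonal is fixed at 1 as in multivariate probit models.

    free : list of (i, j) pairs with i < j that are allowed to move
    s    : proposal standard deviation, tuned during the adapting period
    """
    for i, j in free:
        prop = R.copy()
        step = rng.normal(0.0, s)
        prop[i, j] += step
        prop[j, i] += step                    # preserve symmetry
        try:
            np.linalg.cholesky(prop)          # positive definiteness test
        except np.linalg.LinAlgError:
            continue                          # constraint violated: reject
        if np.log(rng.uniform()) < log_post(prop) - log_post(R):
            R = prop                          # accept this element's move
    return R
```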
Discussion
In this paper we have given several algorithms for updating variance matrices within an MCMC framework. We have considered MH-based steps that can be used as an alternative to the standard Gibbs sampling IW or inverse gamma updates. We have then shown how two of these methods can be used when the variance matrices are subject to additional constraints. We have identified two distinct types of additional constraint: across-variance constraints and within-variance constraints.
The key to these
Conclusions
This paper has considered the problem of finding a general MCMC method that can be used to fit a large family of models containing variance matrices with additional parameter constraints. Our motivation is to find a generic method that is applicable to as many models as possible, not to find the most efficient method for particular groups of models. The paper considers two simple single-site updating Metropolis–Hastings steps that will bolt easily into MCMC algorithms. Some
References (28)
- Browne, W.J., Draper, D., Goldstein, H., Rasbash, J., 2002. A comparison of methods for fitting multilevel models with complex level 1 variation. Comput. Statist. Data Anal.
- Chib, S., 1992. Bayes inference in the tobit censored regression model. J. Econometrics.
- et al., 2000. Bayesian analysis of cross-sectional and clustered data selection models. J. Econometrics.
- Albert, J.H., Chib, S., 1993. Bayesian analysis of binary and polychotomous response data. J. Amer. Statist. Assoc.
- Browne, W.J., 1998. Applying MCMC methods to multilevel models. Ph.D. Dissertation, Department of Mathematical...
- Browne, W.J., 2003. MCMC Estimation in MLwiN. Institute of Education, University of London,...
- Browne, W.J., Draper, D., 2000. Implementation and performance issues in the Bayesian and likelihood fitting of multilevel models. Comput. Statist.
- Chib, S., Greenberg, E., 1995. Understanding the Metropolis–Hastings algorithm. Amer. Statist.
- Chib, S., Greenberg, E., 1998. Analysis of multivariate probit models. Biometrika.
- Draper, D., Cheal, R., 1997. Practical MCMC for assessment and propagation of model uncertainty. Unpublished Technical...
- Gelfand, A.E., Smith, A.F.M., 1990. Sampling based approaches to calculating marginal densities. J. Amer. Statist. Assoc.
- Gelfand, A.E., Hills, S.E., Racine-Poon, A., Smith, A.F.M., 1990. Illustration of Bayesian inference in normal data models using Gibbs sampling. J. Amer. Statist. Assoc.
- Gelfand, A.E., Smith, A.F.M., Lee, T.-M., 1992. Bayesian analysis of constrained parameter and truncated data problems using Gibbs sampling. J. Amer. Statist. Assoc.
- Gelfand, A.E., Sahu, S.K., Carlin, B.P., 1995. Efficient parameterizations for normal linear mixed models. Biometrika.
Cited by (33)
- Stochastic weighted graphs: Flexible model specification and simulation. Social Networks, 2017.
  Citation excerpt: "The Metropolis–Hastings procedure that we propose samples the t+1st network, x(t+1), via a truncated multivariate Gaussian proposal distribution q(·|x(t)) whose mean depends on the previous sample x(t) and whose variance is a fixed constant σ². The truncated Gaussian is a convenient and commonly used proposal distribution for bounded random variables such as those on the [0, 1] interval with which we are working (see, e.g., Browne, 2006; Claeskens et al., 2010; Müller, 2010; Neelon et al., 2014; Franks et al., 2015). The advantage of the truncated Gaussian over the obvious alternative for bounded random variables – the Beta distribution – is that it is straightforward to concentrate the density of the truncated Gaussian around any point within the bounded range."
- Multiple Imputation and its Application. 2023.
- A Bayesian Approach to Estimating Reciprocal Effects with the Bivariate STARTS Model. Multivariate Behavioral Research, 2023.
- A Mixture Response Time Process Model for Aberrant Behaviors and Item Nonresponses. Multivariate Behavioral Research, 2023.
- Substantive model compatible multilevel multiple imputation: A joint modeling approach. Statistics in Medicine, 2022.