Block-band behavior of spatial correlations: An analytical asymptotic study in a spatial exponential family data setup

https://doi.org/10.1016/j.jmva.2021.104785

Abstract

There is a long history of spatial regression analysis where it is important to accommodate the spatial correlations among the responses from neighboring locations for any valid inferences. Among numerous modeling approaches, the so-called spatial auto-regression (SAR) model in a linear setup, and the conditional auto-regression (CAR) model in a binary setup, are widely used. For spatial binary analysis, there exist two other competitive approaches, namely the bivariate probit model (BPM) based composite likelihood approach using local lattices, and a ‘working’ correlations based quasi-likelihood (WCQL) approach. These correlation models, however, fail to accommodate both within and between correlations among spatial families, where a spatial family is naturally formed within a threshold distance of a selected location, and the member locations of two neighboring families may also be correlated. In this paper, we exploit these two-way (within and between) correlations among spatial families and develop a unified correlation model for all exponential family based data, such as linear, count or binary data. We further use the proposed correlation structure to develop generalized quasi-likelihood (GQL) and method of moments (MM) approaches for estimating the model parameters. As far as the estimation properties are concerned, because in practice one encounters a large spatial sample, we show that the proposed GQL and MM estimators are consistent.

Introduction

Spatial data are realizations of random variables collected from a sequence of related geographical locations, where the responses (linear, count or binary) collected from adjacent locations naturally become correlated. These correlations are referred to as the spatial correlations. In a spatial setup for linear data, there exist many studies dealing with spatial regression analysis after accommodating spatial correlations among the responses from neighboring locations. An extensive list of references on this research topic is available in Cressie and Wikle [8], for example. Most of these linear data based studies model the spatial correlations among responses from neighboring locations by using a spatial auto-regression (SAR) type model [3], or a linear mixed model (LMM) with a variety of Gaussian or certain patterned covariance structures. Cressie [7], for example, suggests using various correlation structures such as the exponential covariance function, the Gaussian covariance function and a reciprocal covariance function. Similar correlation functions, but obtained by using the spectral approach, were exploited by Jones and Vecchia [10] (see also, e.g., [15], [21], [22]) in order to define pairwise spatial covariances.
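As a small illustration of the distance-based covariance functions mentioned above, the following Python sketch evaluates the exponential and Gaussian forms in one common textbook parameterization. The function names and the parameters `sigma2` (variance) and `scale` (range) are assumptions made for this illustration; they are not notation from this paper, and other parameterizations are in use.

```python
import math


def exponential_cov(distance, sigma2=1.0, scale=1.0):
    """Exponential covariance: sigma2 * exp(-d / scale) (assumed parameterization)."""
    return sigma2 * math.exp(-distance / scale)


def gaussian_cov(distance, sigma2=1.0, scale=1.0):
    """Gaussian covariance: sigma2 * exp(-(d / scale)^2) (assumed parameterization)."""
    return sigma2 * math.exp(-((distance / scale) ** 2))


if __name__ == "__main__":
    # Both functions equal sigma2 at distance zero and decay as distance grows.
    for d in (0.0, 0.5, 1.0, 2.0):
        print(d, exponential_cov(d), gaussian_cov(d))
```

Both functions decay smoothly with distance and never reach exactly zero, which differs from the block-banded structure studied in this paper, where correlations vanish beyond a threshold distance.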

There is also a long history of spatial binary data analysis. See, for example, Besag [4], [5], [6] for some early studies. For some recent studies over the past three decades, see, for example, Rathbun and Cressie [14], Heagerty and Lele [9], Lin and Clayton [11], and Ainsworth et al. [1]. As far as the model for spatial binary data is concerned, these existing studies mostly used the CAR [6], BPM [9], and WCQL [11] models to accommodate spatial binary correlations. For convenience, we provide a brief review of these CAR, BPM, and WCQL models in Section 3.1, along with certain difficulties encountered by these models. More specifically, it is demonstrated that these widely used models fail to address the within and between spatial familial correlations while developing a spatial correlation structure. In contrast to these models, some recent studies developed a spatial correlation model by accommodating both within and between spatial familial correlations. See, for example, Mariathas and Sutradhar [12] in a linear setup, and Sutradhar and Oyet [18] in a binary setup. In this paper, we apply these ideas [12], [18] in Section 3.2 and develop a general spatial correlation model for exponential family based data, such as linear, count or binary data. The basic moment properties, such as the mean and variance of the exponential family based spatial data, are given in Section 2.

In Section 4, it is demonstrated that the proposed two-way correlation approach developed in Section 3.2 produces a block-banded spatial correlation (BBSC) matrix. Once evaluated, the elements of this BBSC matrix clearly exhibit spatial correlations among neighboring locations. More specifically, any zero correlation in a row or column away from the main diagonal of the matrix indicates that the underlying two spatial locations are so far from each other that all pair-wise distances between their family members are beyond a fixed threshold distance. In the same section, we then obtain the inverse of the BBSC matrix, which is subsequently used in Section 5 to develop the so-called GQL (generalized quasi-likelihood) estimating equations for the spatial regression parameters. For the estimation of the random effects variance and correlation parameters, we develop second order response based moment equations. The asymptotic properties, such as consistency, of these GQL and moment estimators are also given in the same section. The paper concludes in Section 6.
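The zero pattern described above can be sketched in Python. This is only an illustration under stated assumptions, not the paper's construction: locations are taken to be points on a line, a family is taken to be all locations within the threshold distance of a selected location, and the function names `family` and `band_pattern` are hypothetical. The sketch marks which pairs of locations may be correlated; it does not compute the correlation values themselves.

```python
def family(i, locations, threshold):
    """Indices of all locations within `threshold` of location i (i included)."""
    return [k for k, s in enumerate(locations)
            if abs(s - locations[i]) <= threshold]


def band_pattern(locations, threshold):
    """Boolean matrix: entry (i, j) is False only when every pair-wise distance
    between members of family i and members of family j exceeds the threshold."""
    K = len(locations)
    families = [family(i, locations, threshold) for i in range(K)]
    pattern = [[False] * K for _ in range(K)]
    for i in range(K):
        for j in range(K):
            pattern[i][j] = any(
                abs(locations[a] - locations[b]) <= threshold
                for a in families[i]
                for b in families[j]
            )
    return pattern


if __name__ == "__main__":
    # Assumed toy layout: 8 equally spaced locations, threshold distance 1.
    locations = [float(i) for i in range(8)]
    for row in band_pattern(locations, 1.0):
        print("".join("1" if flag else "." for flag in row))
```

With locations labeled in nearest-neighbor order, the possibly nonzero entries cluster around the main diagonal, which is the banded shape the text refers to.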

Section snippets

Pair-wise spatial familial random effects based model for exponential family data

Suppose that there are $K$ locations in the entire spatial field where we label the $i$th, $i \in \{1,\ldots,K\}$, spatial location as $s_i$ following the nearest neighbor distance criterion $|s_i - s_{i+1}| \le |s_i - s_{i+2}|$, $i \in \{1,\ldots,K-2\}$. (1) Also suppose that $y(s_i)$ is an exponential family based response, such as a linear, count or binary response, collected from the spatial location $s_i$. It then follows by (1) that $y(s_{i-1})$ and $y(s_{i+1})$ are two responses from the nearest neighborhood of $s_i$. Next suppose that $x_i=(x_{i1},\ldots,x_{ip})$ denotes the p-dimensional

Existing spatial binary models: A brief review

As indicated in Section 1, among all spatial correlation models developed over the last five decades, the CAR (conditional auto-regressive) model [6], BPM (bivariate probit model) [9], and WCQL (‘working’ correlations based quasi-likelihood) model [11], are widely used or referenced. These three models are, however, quite different from each other. We describe these models in the following three sub-sections, along with their advantages and drawbacks.

Block-banded spatial correlation structure for spatial exponential family data

Our main objective in this section is to demonstrate that as opposed to the CAR, WCQL/WAR(1), and BPM models for spatial binary data discussed in Section 3.1, the moving pair-wise neighboring familial mixed model for spatial data discussed in Section 3.2 generates a block-banded (BB) spatial correlation (BBSC) structure, the responses from two far distant families/locations being uncorrelated. We remark that the construction of this BBSC matrix does not depend on the form of the response data,

Estimation and the asymptotic properties

Given that we have computed the unconditional spatial means, $E[Y(s_i)]=\mu_i(\beta,\sigma_\gamma^2,\phi)$ for all $i \in \{1,\ldots,K\}$, given by (11) under Theorem 1, and the spatial variance covariance matrix $\Sigma_{(L_1)BB}(\beta,\sigma_\gamma^2,\phi)$ given by (35) (see also (31)), we can utilize these spatial moments up to order 2 to develop moments based estimating equations for all parameters $\beta$, $\sigma_\gamma^2$, and $\phi$. More specifically, we develop the first order response based GQL (generalized quasi-likelihood) as well as QL (equivalent to method of moments

Concluding remarks

It is demonstrated in the paper that it is either impossible or difficult to compute the pair-wise spatial correlations under the existing widely used CAR, BPM, and WCQL/WAR(1) models used mainly for the spatial binary data analysis. Consequently, the subsequent likelihood or quasi-likelihood estimates of the regression and possibly correlation index parameters were never used to examine the correlation pattern of the responses. The modeling drawbacks of these approaches get further mounted

Acknowledgments

The author would like to thank the Editor-in-chief and the reviewer for their valuable comments and suggestions that led to the improvement of the paper. This research was partially supported by a grant from the Natural Sciences and Engineering Research Council of Canada.

References (23)

  • Sutradhar B.C. et al., On marginal quasi-likelihood inferences in generalized linear mixed models, J. Multivariate Anal. (2001)
  • Ainsworth L.M. et al., Zero-inflated spatial models: Applications and interpretation
  • Asif A. et al., Block matrices with L-block-banded inverse: Inversion algorithms, IEEE Trans. Signal Process. (2005)
  • Basu S. et al., Regression models with spatially correlated errors, J. Amer. Statist. Assoc. (1994)
  • Besag J., Nearest-neighbour systems and the auto-logistic model for binary data, J. R. Stat. Soc. Ser. B Stat. Methodol. (1972)
  • Besag J., On the correlation structure of some two-dimensional stationary processes, Biometrika (1972)
  • Besag J., Spatial interaction and the statistical analysis of lattice systems, J. R. Stat. Soc. Ser. B Stat. Methodol. (1974)
  • Cressie N., Statistics for Spatial Data (1991)
  • Cressie N. et al., Statistics for Spatio-Temporal Data (2011)
  • Heagerty P.J. et al., A composite likelihood approach to binary spatial data, J. Amer. Statist. Assoc. (1998)
  • Jones R.H. et al., Fitting continuous ARMA models to unequally spaced spatial data, J. Amer. Statist. Assoc. (1993)