
Fast Bayesian inference of block Nearest Neighbor Gaussian models for large data

  • Original Paper
  • Published in: Statistics and Computing

Abstract

This paper presents the development of a spatial block-Nearest Neighbor Gaussian process (blockNNGP) for location-referenced large spatial data. The key idea behind this approach is to divide the spatial domain into several blocks which are dependent under some constraints. The dependence across blocks captures the large-scale spatial dependence, while each block captures the small-scale spatial dependence. The resulting blockNNGP enjoys Markov properties, reflected in its sparse precision matrix. It is embedded as a prior within the class of latent Gaussian models, so that fast Bayesian inference is obtained using the integrated nested Laplace approximation (INLA). The performance of the blockNNGP is illustrated through simulated examples, a comparison of our approach with other methods for analyzing large spatial data, and applications to Gaussian and non-Gaussian real data.



Acknowledgements

The authors would like to thank the referee for the suggestions that we believe substantially improved the content of the paper. Also, Zaida C. Quiroz would like to thank the Pontificia Universidad Católica del Perú for partial financial support through the project [DGI-2019-740]. Marcos O. Prates would like to acknowledge partial funding support from Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) grants 436948/2018-4 and 307547/2018-4 and Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG) grant APQ-01837-22.

Author information

Corresponding author

Correspondence to Zaida C. Quiroz.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Materials:

[A:] Formal proofs, detailed descriptions of the full-MCMC and collapsed-MCMC algorithms, and additional simulation and application results (pdf). [B:] Code to run the NNGP and blockNNGP models through R-INLA is available at https://github.com/Zaidajqc/blockNNGP. (pdf 5,153 KB)
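For orientation only, the sketch below shows how a precision matrix of this form can be plugged into R-INLA as a user-defined latent model through the generic rgeneric interface. It is a minimal skeleton, not the authors' implementation (which is in the repository above); build.blockNNGP.Q() is a hypothetical helper standing in for the assembly of \({\varvec{B_s^T F_s^{-1} B_s}}\) from the block and neighbor structure.

library(INLA)
library(Matrix)

## Schematic rgeneric skeleton (not the authors' code): the blockNNGP enters
## R-INLA as a user-defined latent model whose precision Q(theta) is the
## sparse matrix B_s^T F_s^{-1} B_s of Proposition 2.
blockNNGP.rgeneric <- function(
    cmd = c("graph", "Q", "mu", "initial", "log.norm.const", "log.prior", "quit"),
    theta = NULL) {

  Q <- function() {
    ## build.blockNNGP.Q() is a hypothetical helper that assembles B_s and F_s
    ## from the blocks, their neighbor blocks and the covariance parameters
    ## exp(theta); see the GitHub repository for the actual implementation.
    build.blockNNGP.Q(theta)
  }
  graph          <- function() Q()           # sparsity pattern of the precision
  mu             <- function() numeric(0)    # zero-mean latent field
  log.norm.const <- function() numeric(0)    # let INLA compute it from Q
  log.prior      <- function() sum(dnorm(theta, log = TRUE))  # placeholder prior
  initial        <- function() c(0, 0)       # e.g. log-variance and log-range
  quit           <- function() invisible()

  do.call(match.arg(cmd), args = list())
}

## model <- inla.rgeneric.define(blockNNGP.rgeneric)
## fit   <- inla(y ~ 1 + f(idx, model = model), data = dat, family = "gaussian")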

Appendix: Proof of Proposition 2

Matrix Analysis Background

Theorem A1: A matrix \({\varvec{B}} \in {\varvec{\Re }}^{m\times n}\) has full column rank if and only if \({\varvec{B^{T}B}}\) is invertible.

Theorem A2: The determinant of an \(n\times n\) matrix \({\varvec{B}}\) is 0 if and only if the matrix \({\varvec{B}}\) is not invertible.

Theorem A3: Let \({\varvec{T_n}}\) be a triangular matrix (either upper or lower) of order n. Let \(|{\varvec{T_n}}|\) be the determinant of \({\varvec{T_n}}\). Then \(|{\varvec{T_n}}|\) is equal to the product of all the diagonal elements of \({\varvec{T_n}},\) that is, \(|{\varvec{T_n}}| = \prod _{k=1}^{n}(a_{kk}).\)

Proposition A1: If \({\varvec{B}}\) is positive definite (p.d.) and \({\varvec{S}}\) has full column rank, then \({\varvec{S^{T}BS}}\) is positive definite.

Corollary A1: If \({\varvec{B}}\) is positive definite, then \({\varvec{B^{-1}}}\) is positive definite.
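As a quick numerical illustration of these facts (an editorial toy check in base R, not part of the original appendix):

set.seed(1)

## Theorem A3: the determinant of a triangular matrix equals the product of its diagonal.
Tn <- matrix(0, 3, 3)
Tn[lower.tri(Tn, diag = TRUE)] <- rnorm(6)
all.equal(det(Tn), prod(diag(Tn)))                   # TRUE

## Theorem A1: B has full column rank iff B^T B is invertible.
B <- matrix(rnorm(12), nrow = 4, ncol = 3)           # full column rank with probability 1
c(qr(B)$rank == ncol(B), det(crossprod(B)) != 0)     # TRUE TRUE

## Proposition A1: if A is p.d. and S has full column rank, then S^T A S is p.d.
A <- crossprod(matrix(rnorm(16), 4, 4)) + diag(4)    # p.d. by construction
S <- matrix(rnorm(12), nrow = 4, ncol = 3)
min(eigen(crossprod(S, A %*% S), symmetric = TRUE, only.values = TRUE)$values) > 0  # TRUE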

Proof

Without loss of generality, assume that the data were reordered by blocks. From known properties of Gaussian distributions,

$$\begin{aligned} {\varvec{w_{b_k}}}|{\varvec{w_{N(b_k)}}} \sim N({\varvec{B_{b_k}}} {\varvec{w_{N(b_k)}}}, {\varvec{F_{b_k}}}), \end{aligned}$$

where \({\varvec{B_{b_k}}} = {\varvec{C_{b_k, N(b_k)}}} {\varvec{C^{-1}_{N(b_k)}}}\) and \({\varvec{F_{b_k}}} = {\varvec{C_{b_k}}} - {\varvec{C_{b_k, N(b_k)}}} {\varvec{C^{-1}_{N(b_k)}}}{\varvec{C_{ N(b_k),b_k}}}.\) Hence,

$$\begin{aligned} \widetilde{\pi }({\varvec{w_S}})&= \prod _{k=1}^M p({\varvec{w_{b_k}}}|{\varvec{w_{N(b_k)}}}) \\&\propto \prod _{k=1}^M \frac{1}{|{\varvec{F_{b_k}}}|^{1/2}} \exp \left\{ -\frac{1}{2} ({\varvec{w_{b_k}}} - {\varvec{B_{b_k}}} {\varvec{w_{N(b_k)}}})^T {\varvec{F_{b_k}^{-1}}} ({\varvec{w_{b_k}}} - {\varvec{B_{b_k}}} {\varvec{w_{N(b_k)}}}) \right\} \\&\propto \frac{1}{ \prod _{k=1}^M |{\varvec{F_{b_k}}}|^{1/2}} \exp \left\{ -\frac{1}{2} \sum _{k=1}^M ({\varvec{w_{b_k}}} - {\varvec{B_{b_k}}} {\varvec{w_{N(b_k)}}})^T {\varvec{F_{b_k}^{-1}}} ({\varvec{w_{b_k}}} - {\varvec{B_{b_k}}} {\varvec{w_{N(b_k)}}}) \right\} . \end{aligned}$$

Let \({\varvec{w_{b_k}}} - {\varvec{B_{b_k}}} {\varvec{w_{N(b_k)}}} = {\varvec{B^\star _{b_k}}} {\varvec{w_S}}\), and let j index the j-th observation of block \({\varvec{b_k}}\). Then, \(\forall k = 1, \dots , M\), \(i = 1, \dots , n\) and \( j = 1, \dots , n_{b_k}\):

$$\begin{aligned} B^{\star }_{b_k} (j,i) = {\left\{ \begin{array}{ll} 1 & \quad \text {if } s_i = s_{b_k}[j], \\ -B_{b_k}[j,l] & \quad \text {if } s_i = s_{N(b_k)}[l];\; l= 1, \dots , N_{b_k}, \\ 0 & \quad \text {otherwise,} \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} {\varvec{B^{\star }_{b_k}}} = \begin{bmatrix} {\varvec{B^{\star }_{b_k} (1)}} \\ \vdots \\ {\varvec{B^{\star }_{b_k} (j)}} \\ \vdots \\ {\varvec{B^{\star }_{b_k} (n_{b_k})}} \end{bmatrix} _{n_{b_k} \times n},\end{aligned}$$

where \({\varvec{B^{\star }_{b_k} (j)}}= (B^{\star }_{b_k} (j,1),B^{\star }_{b_k} (j,2),\dots , B^{\star }_{b_k} (j,n))\). From these definitions, \( {\varvec{B^{\star }_{b_k}}}\) is a matrix whose i-th column is full of zeros whenever \(s_i\notin {\varvec{s_{b_k}}}\) and \( s_i\notin {\varvec{N(s_{b_k})}}\). Since the data were reordered by blocks and the neighbor blocks come from the past, \({\varvec{B^{\star }_{b_k}}}\) also has the following form:

$$\begin{aligned} {\varvec{B^{\star }_{b_k}}} = \left[ \begin{array}{ccc} {\varvec{R_k}}& \quad {\varvec{A_k}}& \quad {\varvec{0}} \end{array}\right] , \end{aligned}$$

where \({\varvec{A_k}}\) is an \(n_{b_k}\times n_{b_k}\) matrix, \({\varvec{R_k}}\) is an \(n_{b_k}\times \sum _{r=1}^{k-1} n_{b_r}\) matrix with at least one non-null column whenever \(nb \ne 0\) (i.e., whenever block \({\varvec{b_k}}\) has at least one neighbor block), and \({\varvec{0}}\) collects the columns of the remaining (future) blocks.
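To make this construction concrete, the following toy R sketch (editorial, assuming an exponential covariance; not the authors' code) computes \({\varvec{B_{b_k}}}\), \({\varvec{F_{b_k}}}\) and the padded matrix \({\varvec{B^{\star }_{b_k}}}\) for a single block whose neighbor set is the whole preceding block, and checks the identity \({\varvec{w_{b_k}}} - {\varvec{B_{b_k}}} {\varvec{w_{N(b_k)}}} = {\varvec{B^{\star }_{b_k}}} {\varvec{w_S}}\):

set.seed(2)
n1 <- 5; n2 <- 4; n <- n1 + n2             # block b1 (taken as N(b2)) and block b2
S  <- matrix(runif(2 * n), ncol = 2)       # locations, ordered by block: b1 then b2
C  <- exp(-as.matrix(dist(S)) / 0.3)       # assumed exponential covariance, range 0.3
idx1 <- 1:n1; idx2 <- (n1 + 1):n           # indices of N(b2) = b1 and of b2

## B_{b2} = C_{b2,N(b2)} C_{N(b2)}^{-1} and F_{b2} = C_{b2} - B_{b2} C_{N(b2),b2}
B2 <- C[idx2, idx1] %*% solve(C[idx1, idx1])
F2 <- C[idx2, idx2] - B2 %*% C[idx1, idx2]

## Padded matrix B*_{b2} = [R_2  A_2  0]: here R_2 = -B_{b2} (every location of b1
## is a neighbor), A_2 = I_{n2}, and there are no "future" blocks to pad with zeros.
Bstar2 <- cbind(-B2, diag(n2))

w <- rnorm(n)                              # any realization of w_S
all.equal(drop(Bstar2 %*% w), drop(w[idx2] - B2 %*% w[idx1]))  # TRUE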

Then,

$$\begin{aligned} \widetilde{\pi }({\varvec{w_S}})&\propto \frac{1}{ \prod _{k=1}^M |{\varvec{F_{b_k}}}|^{1/2}} \exp \left\{ -\frac{1}{2} \sum _{k=1}^M ({\varvec{B^\star _{b_k}}} {\varvec{w_S}})^T {\varvec{F_{b_k}^{-1}}} ({\varvec{B^\star _{b_k}}} {\varvec{w_S}}) \right\} \\&\propto \frac{1}{ \prod _{k=1}^M |{\varvec{F_{b_k}}}|^{1/2}} \exp \left\{ -\frac{1}{2} \sum _{k=1}^M {\varvec{w_S^T}} ({\varvec{B^\star _{b_k}}})^T {\varvec{F_{b_k}^{-1}}} {\varvec{B^\star _{b_k}}} {\varvec{w_S}} \right\} \\&\propto \frac{1}{ \prod _{k=1}^M |{\varvec{F_{b_k}}}|^{1/2}} \exp \left\{ -\frac{1}{2} {\varvec{w_S^T}} \left( \sum _{k=1}^M ({\varvec{B^\star _{b_k}}})^T {\varvec{F_{b_k}^{-1}}} {\varvec{B^\star _{b_k}}} \right) {\varvec{w_S}} \right\} . \end{aligned}$$

Let \(\sum _{k=1}^M ({\varvec{B^\star _{b_k}}})^T {\varvec{F^{-1}_{b_k}}} {\varvec{B^\star _{b_k}}} = {\varvec{B_s^T}} {\varvec{F_s ^{-1}}} {\varvec{B_s}} \), where \({\varvec{B_{s}}} = \left[ \begin{array}{ccc} ({\varvec{B^{\star }_{b_1}}})^T& \quad \dots& \quad ({\varvec{B^{\star }_{b_M}}})^T \end{array}\right] ^T\) stacks the matrices \({\varvec{B^{\star }_{b_k}}}\) by rows and \({\varvec{F_s ^{-1}}} = \text{ diag }( {\varvec{F_{b_1} ^{-1}}}, \dots , {\varvec{F_{b_M} ^{-1}}})\) is a block diagonal matrix. Given that \( {\varvec{B^{\star }_{b_k}}}\) has its i-th column full of zeros for \(i> \sum _{r=1}^{k} n_{b_r}\), the matrix \({\varvec{B_s}}\) is lower triangular.

Then, \(\widetilde{\pi }({\varvec{w_S}}) \propto \frac{1}{ \prod _{k=1}^M |{\varvec{F_{b_k}}}|^{1/2}} \exp \left\{ -\frac{1}{2} {\varvec{w_S^T}} ( {\varvec{B_s^T F_s ^{-1}B_s}}) {\varvec{w_S}} \right\} \) and \( {\varvec{{\widetilde{C}}_s^{-1}}} = {\varvec{{\widetilde{Q}}_s}}= {\varvec{B_s^T F_s ^{-1}B_s}}.\)
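For illustration, in the simplest case M = 2, where block \({\varvec{b_1}}\) has no neighbor blocks (so \({\varvec{F_{b_1}}} = {\varvec{C_{b_1}}}\)) and \(N({\varvec{b_2}}) = {\varvec{b_1}}\) with \({\varvec{R_2}} = -{\varvec{B_{b_2}}}\), these matrices reduce to

$$\begin{aligned} {\varvec{B_s}} = \begin{bmatrix} {\varvec{I_{n_{b_1}}}} & {\varvec{0}} \\ {\varvec{R_2}} & {\varvec{I_{n_{b_2}}}} \end{bmatrix}, \qquad {\varvec{{\widetilde{Q}}_s}} = {\varvec{B_s^T F_s ^{-1}B_s}} = \begin{bmatrix} {\varvec{C_{b_1}^{-1}}} + {\varvec{R_2^T F_{b_2}^{-1} R_2}} & {\varvec{R_2^T F_{b_2}^{-1}}} \\ {\varvec{F_{b_2}^{-1} R_2}} & {\varvec{F_{b_2}^{-1}}} \end{bmatrix}, \end{aligned}$$

in agreement with the precision of the exact joint density \(p({\varvec{w_{b_1}}})\,p({\varvec{w_{b_2}}}|{\varvec{w_{b_1}}})\).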

We also need to prove that \({\varvec{\widetilde{Q}_s}}\) is positive definite. From properties of the normal distribution, the covariance matrix of the conditional distribution of \({\varvec{w_{b_k}}}|{\varvec{w_{N(b_k)}}}\) is also p.d. (by the Schur complement conditions); hence

$$\begin{aligned} {\varvec{F_{b_k}}} = {\varvec{C_{b_k}}} - {\varvec{C_{b_k,N(b_k)}}}{\varvec{C^{-1}_{N(b_k)}}}{\varvec{C_{N(b_k),b_k}}} \end{aligned}$$

is p.d. Moreover, \({\varvec{F_s}} = \text{ diag }({\varvec{F_{b_k}}})\) is block diagonal, and a block diagonal matrix is p.d. if and only if each of its diagonal blocks is p.d.; since every \({\varvec{F_{b_k}}}\) is p.d., so is \({\varvec{F_s}}\). By Corollary A1, \({\varvec{F^{-1}_s}}\) is then also p.d. By Theorem A1, \({\varvec{B_s}}\) has full column rank if and only if \({\varvec{R_s}} = {\varvec{B^T_s B_s}}\) is invertible, and by Theorem A2 the inverse of \({\varvec{R_s}}\) exists if and only if \(|{\varvec{R_s}}|\ne 0\). Now, \(|{\varvec{R_s}}|= |{\varvec{B^T_s B_s}}| = |{\varvec{B^T_s}}| |{\varvec{B_s}}| \ne 0\) whenever \(|{\varvec{B_s}}| \ne 0\). Given that \({\varvec{B_s}}\) is a lower triangular matrix, by Theorem A3, \(|{\varvec{B_s}}| = \prod _{k=1}^{n} b_{kk}\), and since \(b_{kk} =1\) for all k, \(|{\varvec{B_s}}| \ne 0.\) Therefore \({\varvec{R_s}}\) is invertible and \({\varvec{B_s}}\) has full column rank. Hence, by Proposition A1, since \({\varvec{B_s}}\) has full column rank and \({\varvec{F_s^{-1}}}\) is p.d., \({\varvec{\widetilde{Q}_s}} = {\varvec{\widetilde{C}_s^{-1}}} = {\varvec{B^T_s F_s^{-1} B_s}}\) is p.d.

Also, by Corollary A1, since \({\varvec{\widetilde{C}_s^{-1}}}\) is p.d., \({\varvec{\widetilde{C}_s}}\) is p.d. Finally, since \({\widetilde{\pi }}({\varvec{w_S}}) \propto \frac{1}{ \prod _{k=1}^M |{\varvec{F_{b_k}}}|^{1/2}} \exp \left\{ -\frac{1}{2} {\varvec{w_S}}^T {\varvec{{\widetilde{C}}_s}}^{-1} {\varvec{w_S}} \right\} \) with \({\varvec{{\widetilde{C}}_s}}^{-1} = {\varvec{B_s^T F_s ^{-1}B_s}}\) and \({\varvec{\widetilde{C}_s}}\) p.d., \({\widetilde{\pi }}({\varvec{w_S}})\) is the probability density function of a multivariate normal distribution. \(\square \)
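As a numerical complement (an editorial toy check in R under an assumed exponential covariance, not the authors' implementation), the snippet below assembles \({\varvec{B_s}}\), \({\varvec{F_s}}\) and \({\varvec{{\widetilde{Q}}_s}} = {\varvec{B_s^T F_s ^{-1}B_s}}\) for the two-block case above and verifies that the Cholesky factorization of \({\varvec{{\widetilde{Q}}_s}}\) succeeds, confirming positive definiteness:

library(Matrix)                             # for bdiag()
set.seed(3)

n1 <- 6; n2 <- 5; n <- n1 + n2
S  <- matrix(runif(2 * n), ncol = 2)        # locations ordered by block: b1 then b2
C  <- exp(-as.matrix(dist(S)) / 0.3)        # assumed exponential covariance
idx1 <- 1:n1; idx2 <- (n1 + 1):n            # block b1 and block b2, with N(b2) = b1

## Per-block quantities: b1 has no neighbor blocks, so F_{b1} = C_{b1}
B2 <- C[idx2, idx1] %*% solve(C[idx1, idx1])
F1 <- C[idx1, idx1]
F2 <- C[idx2, idx2] - B2 %*% C[idx1, idx2]

## B_s is lower triangular with unit diagonal; F_s^{-1} is block diagonal
Bs    <- rbind(cbind(diag(n1), matrix(0, n1, n2)),
               cbind(-B2,      diag(n2)))
Fsinv <- as.matrix(bdiag(solve(F1), solve(F2)))

Qs <- t(Bs) %*% Fsinv %*% Bs                # blockNNGP precision matrix
Qs <- (Qs + t(Qs)) / 2                      # symmetrize against rounding error
chol(Qs)                                    # no error: Qs is positive definite
max(abs(solve(Qs) - C))                     # ~ 0: exact here, since N(b2) is all of b1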

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Quiroz, Z.C., Prates, M.O., Dey, D.K. et al. Fast Bayesian inference of block Nearest Neighbor Gaussian models for large data. Stat Comput 33, 54 (2023). https://doi.org/10.1007/s11222-023-10227-1



  • DOI: https://doi.org/10.1007/s11222-023-10227-1
