
Fast Bayesian inference of block Nearest Neighbor Gaussian models for large data

  • Original Paper
  • Published in: Statistics and Computing

Abstract

This paper presents the development of a spatial block-Nearest Neighbor Gaussian process (blockNNGP) for location-referenced large spatial data. The key idea behind this approach is to divide the spatial domain into several blocks which are dependent under some constraints. The dependence across blocks captures the large-scale spatial dependence, while each block captures the small-scale spatial dependence. The resulting blockNNGP enjoys Markov properties, reflected in its sparse precision matrix. It is embedded as a prior within the class of latent Gaussian models, so that fast Bayesian inference is obtained using the integrated nested Laplace approximation (INLA). The performance of the blockNNGP is illustrated through simulated examples, a comparison of our approach with other methods for analyzing large spatial data, and applications to Gaussian and non-Gaussian real data.



Acknowledgements

The authors would like to thank the referee for the suggestions that we believe substantially improved the content of the paper. Also, Zaida C. Quiroz would like to thank the Pontificia Universidad Católica del Perú for partial financial support through the project [DGI-2019-740]. Marcos O. Prates would like to acknowledge partial funding support from Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) grants 436948/2018-4 and 307547/2018-4 and Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG) grant APQ-01837-22.

Author information

Corresponding author

Correspondence to Zaida C. Quiroz.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Materials:

[A:] Formal proofs, detailed descriptions of the full-MCMC and collapsed-MCMC algorithms, and additional simulation and application results (pdf). [B:] Code to run the NNGP and blockNNGP models through R-INLA is available at https://github.com/Zaidajqc/blockNNGP. (pdf 5,153 KB)
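For orientation only, the sketch below shows how a precision matrix of this form can be plugged into R-INLA as a user-defined latent model through the generic rgeneric interface. It is a minimal skeleton, not the authors' implementation (which is in the repository above); build.blockNNGP.Q() is a hypothetical helper standing in for the assembly of \({\varvec{B_s^T F_s^{-1} B_s}}\) from the block and neighbor structure.

library(INLA)
library(Matrix)

## Schematic rgeneric skeleton (not the authors' code): the blockNNGP enters
## R-INLA as a user-defined latent model whose precision Q(theta) is the
## sparse matrix B_s^T F_s^{-1} B_s of Proposition 2.
blockNNGP.rgeneric <- function(
    cmd = c("graph", "Q", "mu", "initial", "log.norm.const", "log.prior", "quit"),
    theta = NULL) {

  Q <- function() {
    ## build.blockNNGP.Q() is a hypothetical helper that assembles B_s and F_s
    ## from the blocks, their neighbor blocks and the covariance parameters
    ## exp(theta); see the GitHub repository for the actual implementation.
    build.blockNNGP.Q(theta)
  }
  graph          <- function() Q()           # sparsity pattern of the precision
  mu             <- function() numeric(0)    # zero-mean latent field
  log.norm.const <- function() numeric(0)    # let INLA compute it from Q
  log.prior      <- function() sum(dnorm(theta, log = TRUE))  # placeholder prior
  initial        <- function() c(0, 0)       # e.g. log-variance and log-range
  quit           <- function() invisible()

  do.call(match.arg(cmd), args = list())
}

## model <- inla.rgeneric.define(blockNNGP.rgeneric)
## fit   <- inla(y ~ 1 + f(idx, model = model), data = dat, family = "gaussian")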

Appendix: Proof of Proposition 2

Matrix Analysis Background

Theorem A1: A matrix \({\varvec{B}} \in {\varvec{\Re }}^{m\times n}\) has full column rank if and only if \({\varvec{B^{T}B}}\) is invertible.

Theorem A2: The determinant of an \(n\times n\) matrix \({\varvec{B}}\) is 0 if and only if the matrix \({\varvec{B}}\) is not invertible.

Theorem A3: Let \({\varvec{T_n}}\) be a triangular matrix (either upper or lower) of order n. Let \(|{\varvec{T_n}}|\) be the determinant of \({\varvec{T_n}}\). Then \(|{\varvec{T_n}}|\) is equal to the product of all the diagonal elements of \({\varvec{T_n}},\) that is, \(|{\varvec{T_n}}| = \prod _{k=1}^{n}(a_{kk}).\)

Proposition A1: If \({\varvec{B}}\) is positive definite (p.d.) and \({\varvec{S}}\) has full column rank, then \({\varvec{S^{T}BS}}\) is positive definite.

Corollary A1: If \({\varvec{B}}\) is positive definite, then \({\varvec{B^{-1}}}\) is positive definite.
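As a quick numerical illustration of these facts (an editorial toy check in base R, not part of the original appendix):

set.seed(1)

## Theorem A3: the determinant of a triangular matrix equals the product of its diagonal.
Tn <- matrix(0, 3, 3)
Tn[lower.tri(Tn, diag = TRUE)] <- rnorm(6)
all.equal(det(Tn), prod(diag(Tn)))                   # TRUE

## Theorem A1: B has full column rank iff B^T B is invertible.
B <- matrix(rnorm(12), nrow = 4, ncol = 3)           # full column rank with probability 1
c(qr(B)$rank == ncol(B), det(crossprod(B)) != 0)     # TRUE TRUE

## Proposition A1: if A is p.d. and S has full column rank, then S^T A S is p.d.
A <- crossprod(matrix(rnorm(16), 4, 4)) + diag(4)    # p.d. by construction
S <- matrix(rnorm(12), nrow = 4, ncol = 3)
min(eigen(crossprod(S, A %*% S), symmetric = TRUE, only.values = TRUE)$values) > 0  # TRUE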

Proof

Without loss of generality, assume that the data were reordered by blocks. From known properties of Gaussian distributions,

$$\begin{aligned} {\varvec{w_{b_k}}}|{\varvec{w_{N(b_k)}}} \sim N({\varvec{B_{b_k}}} {\varvec{w_{N(b_k)}}}, {\varvec{F_{b_k}}}), \end{aligned}$$

where \({\varvec{B_{b_k}}} = {\varvec{C_{b_k, N(b_k)}}} {\varvec{C^{-1}_{N(b_k)}}}\) and \({\varvec{F_{b_k}}} = {\varvec{C_{b_k}}} - {\varvec{C_{b_k, N(b_k)}}} {\varvec{C^{-1}_{N(b_k)}}}{\varvec{C_{ N(b_k),b_k}}}.\) Hence,

$$\begin{aligned} \widetilde{\pi }({\varvec{w_S}})&= \prod _{k=1}^M p({\varvec{w_{b_k}}}|{\varvec{w_{N(b_k)}}}) \\&\propto \prod _{k=1}^M \frac{1}{|{\varvec{F_{b_k}}}|^{1/2}} \exp \left\{ -\frac{1}{2} ({\varvec{w_{b_k}}} - {\varvec{B_{b_k}}} {\varvec{w_{N(b_k)}}})^T {\varvec{F_{b_k}^{-1}}} ({\varvec{w_{b_k}}} - {\varvec{B_{b_k}}} {\varvec{w_{N(b_k)}}}) \right\} \\&\propto \frac{1}{ \prod _{k=1}^M |{\varvec{F_{b_k}}}|^{1/2}} \exp \left\{ -\frac{1}{2} \sum _{k=1}^M ({\varvec{w_{b_k}}} - {\varvec{B_{b_k}}} {\varvec{w_{N(b_k)}}})^T {\varvec{F_{b_k}^{-1}}} ({\varvec{w_{b_k}}} - {\varvec{B_{b_k}}} {\varvec{w_{N(b_k)}}}) \right\} . \end{aligned}$$

Let \({\varvec{w_{b_k}}} - {\varvec{B_{b_k}}} {\varvec{w_{N(b_k)}}} = {\varvec{B^\star _{b_k}}} {\varvec{w_S}}\), and let j index the j-th observation of block \({\varvec{b_k}}\). Then, \(\forall k = 1, \dots , M\), \(i = 1, \dots , n\) and \( j = 1, \dots , n_{b_k}\):

$$\begin{aligned} B^{\star }_{b_k} (j,i) = {\left\{ \begin{array}{ll} 1 & \quad \text {if } s_i = s_{b_k}[j], \\ -B_{b_k}[j,l] & \quad \text {if } s_i = s_{N(b_k)}[l];\; l= 1, \dots , N_{b_k}, \\ 0 & \quad \text {otherwise,} \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} {\varvec{B^{\star }_{b_k}}} = \begin{bmatrix} {\varvec{B^{\star }_{b_k} (1)}} \\ \vdots \\ {\varvec{B^{\star }_{b_k} (j)}} \\ \vdots \\ {\varvec{B^{\star }_{b_k} (n_{b_k})}} \end{bmatrix} _{n_{b_k} \times n},\end{aligned}$$

where \({\varvec{B^{\star }_{b_k} (j)}}= (B^{\star }_{b_k} (j,1),B^{\star }_{b_k} (j,2),\dots , B^{\star }_{b_k} (j,n))\). From these definitions, \( {\varvec{B^{\star }_{b_k}}}\) is a matrix whose i-th column is full of zeros whenever \(s_i\notin {\varvec{s_{b_k}}}\) and \( s_i\notin {\varvec{N(s_{b_k})}}\). Since the data were reordered by blocks and the neighbor blocks come from the past, \({\varvec{B^{\star }_{b_k}}}\) also has the following form:

$$\begin{aligned} {\varvec{B^{\star }_{b_k}}} = \left[ \begin{array}{ccc} {\varvec{R_k}}& \quad {\varvec{A_k}}& \quad {\varvec{0}} \end{array}\right] , \end{aligned}$$

where \({\varvec{A_k}}\) is an \(n_{b_k}\times n_{b_k}\) matrix, \({\varvec{R_k}}\) is an \(n_{b_k}\times \sum _{r=1}^{k-1} n_{b_r}\) matrix with at least one non-null column whenever \(nb \ne 0\) (i.e., whenever block \({\varvec{b_k}}\) has at least one neighbor block), and \({\varvec{0}}\) collects the columns of the remaining (future) blocks.
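To make this construction concrete, the following toy R sketch (editorial, assuming an exponential covariance; not the authors' code) computes \({\varvec{B_{b_k}}}\), \({\varvec{F_{b_k}}}\) and the padded matrix \({\varvec{B^{\star }_{b_k}}}\) for a single block whose neighbor set is the whole preceding block, and checks the identity \({\varvec{w_{b_k}}} - {\varvec{B_{b_k}}} {\varvec{w_{N(b_k)}}} = {\varvec{B^{\star }_{b_k}}} {\varvec{w_S}}\):

set.seed(2)
n1 <- 5; n2 <- 4; n <- n1 + n2             # block b1 (taken as N(b2)) and block b2
S  <- matrix(runif(2 * n), ncol = 2)       # locations, ordered by block: b1 then b2
C  <- exp(-as.matrix(dist(S)) / 0.3)       # assumed exponential covariance, range 0.3
idx1 <- 1:n1; idx2 <- (n1 + 1):n           # indices of N(b2) = b1 and of b2

## B_{b2} = C_{b2,N(b2)} C_{N(b2)}^{-1} and F_{b2} = C_{b2} - B_{b2} C_{N(b2),b2}
B2 <- C[idx2, idx1] %*% solve(C[idx1, idx1])
F2 <- C[idx2, idx2] - B2 %*% C[idx1, idx2]

## Padded matrix B*_{b2} = [R_2  A_2  0]: here R_2 = -B_{b2} (every location of b1
## is a neighbor), A_2 = I_{n2}, and there are no "future" blocks to pad with zeros.
Bstar2 <- cbind(-B2, diag(n2))

w <- rnorm(n)                              # any realization of w_S
all.equal(drop(Bstar2 %*% w), drop(w[idx2] - B2 %*% w[idx1]))  # TRUE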

Then,

$$\begin{aligned} \widetilde{\pi }({\varvec{w_S}})&\propto \frac{1}{ \prod _{k=1}^M |{\varvec{F_{b_k}}}|^{1/2}} \exp \left\{ -\frac{1}{2} \sum _{k=1}^M ({\varvec{B^\star _{b_k}}} {\varvec{w_S}})^T {\varvec{F_{b_k}^{-1}}} ({\varvec{B^\star _{b_k}}} {\varvec{w_S}}) \right\} \\&\propto \frac{1}{ \prod _{k=1}^M |{\varvec{F_{b_k}}}|^{1/2}} \exp \left\{ -\frac{1}{2} \sum _{k=1}^M {\varvec{w_S^T}} ({\varvec{B^\star _{b_k}}})^T {\varvec{F_{b_k}^{-1}}} {\varvec{B^\star _{b_k}}} {\varvec{w_S}} \right\} \\&\propto \frac{1}{ \prod _{k=1}^M |{\varvec{F_{b_k}}}|^{1/2}} \exp \left\{ -\frac{1}{2} {\varvec{w_S^T}} \left( \sum _{k=1}^M ({\varvec{B^\star _{b_k}}})^T {\varvec{F_{b_k}^{-1}}} {\varvec{B^\star _{b_k}}} \right) {\varvec{w_S}} \right\} . \end{aligned}$$

Let \(\sum _{k=1}^M ({\varvec{B^\star _{b_k}}})^T {\varvec{F^{-1}_{b_k}}} {\varvec{B^\star _{b_k}}} = {\varvec{B_s^T}} {\varvec{F_s ^{-1}}} {\varvec{B_s}} \), where \({\varvec{B_{s}}} = \left[ \begin{array}{ccc} ({\varvec{B^{\star }_{b_1}}})^T& \quad \dots& \quad ({\varvec{B^{\star }_{b_M}}})^T \end{array}\right] ^T\) stacks the matrices \({\varvec{B^{\star }_{b_k}}}\) by rows and \({\varvec{F_s ^{-1}}} = \text{ diag }( {\varvec{F_{b_1} ^{-1}}}, \dots , {\varvec{F_{b_M} ^{-1}}})\) is a block diagonal matrix. Given that \( {\varvec{B^{\star }_{b_k}}}\) has its i-th column full of zeros for \(i> \sum _{r=1}^{k} n_{b_r}\), the matrix \({\varvec{B_s}}\) is lower triangular.

Then, \(\widetilde{\pi }({\varvec{w_S}}) \propto \frac{1}{ \prod _{k=1}^M |{\varvec{F_{b_k}}}|^{1/2}} \exp \left\{ -\frac{1}{2} {\varvec{w_S^T}} ( {\varvec{B_s^T F_s ^{-1}B_s}}) {\varvec{w_S}} \right\} \) and \( {\varvec{{\widetilde{C}}_s^{-1}}} = {\varvec{{\widetilde{Q}}_s}}= {\varvec{B_s^T F_s ^{-1}B_s}}.\)
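For illustration, in the simplest case M = 2, where block \({\varvec{b_1}}\) has no neighbor blocks (so \({\varvec{F_{b_1}}} = {\varvec{C_{b_1}}}\)) and \(N({\varvec{b_2}}) = {\varvec{b_1}}\) with \({\varvec{R_2}} = -{\varvec{B_{b_2}}}\), these matrices reduce to

$$\begin{aligned} {\varvec{B_s}} = \begin{bmatrix} {\varvec{I_{n_{b_1}}}} & {\varvec{0}} \\ {\varvec{R_2}} & {\varvec{I_{n_{b_2}}}} \end{bmatrix}, \qquad {\varvec{{\widetilde{Q}}_s}} = {\varvec{B_s^T F_s ^{-1}B_s}} = \begin{bmatrix} {\varvec{C_{b_1}^{-1}}} + {\varvec{R_2^T F_{b_2}^{-1} R_2}} & {\varvec{R_2^T F_{b_2}^{-1}}} \\ {\varvec{F_{b_2}^{-1} R_2}} & {\varvec{F_{b_2}^{-1}}} \end{bmatrix}, \end{aligned}$$

in agreement with the precision of the exact joint density \(p({\varvec{w_{b_1}}})\,p({\varvec{w_{b_2}}}|{\varvec{w_{b_1}}})\).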

We also need to prove that \({\varvec{\widetilde{Q}_s}}\) is positive definite. From properties of the normal distribution, the covariance matrix of the conditional distribution of \({\varvec{w_{b_k}}}|{\varvec{w_{N(b_k)}}}\) is also p.d. (by the Schur complement conditions); hence

$$\begin{aligned} {\varvec{F_{b_k}}} = {\varvec{C_{b_k}}} - {\varvec{C_{b_k,N(b_k)}}}{\varvec{C^{-1}_{N(b_k)}}}{\varvec{C_{N(b_k),b_k}}} \end{aligned}$$

is p.d. Moreover, \({\varvec{F_s}} = \text{ diag }({\varvec{F_{b_k}}})\) is block diagonal, and a block diagonal matrix is p.d. if and only if each of its diagonal blocks is p.d.; since every \({\varvec{F_{b_k}}}\) is p.d., so is \({\varvec{F_s}}\). By Corollary A1, \({\varvec{F^{-1}_s}}\) is then also p.d. By Theorem A1, \({\varvec{B_s}}\) has full column rank if and only if \({\varvec{R_s}} = {\varvec{B^T_s B_s}}\) is invertible, and by Theorem A2 the inverse of \({\varvec{R_s}}\) exists if and only if \(|{\varvec{R_s}}|\ne 0\). Now, \(|{\varvec{R_s}}|= |{\varvec{B^T_s B_s}}| = |{\varvec{B^T_s}}| |{\varvec{B_s}}| \ne 0\) whenever \(|{\varvec{B_s}}| \ne 0\). Given that \({\varvec{B_s}}\) is a lower triangular matrix, by Theorem A3, \(|{\varvec{B_s}}| = \prod _{k=1}^{n} b_{kk}\), and since \(b_{kk} =1\) for all k, \(|{\varvec{B_s}}| \ne 0.\) Therefore \({\varvec{R_s}}\) is invertible and \({\varvec{B_s}}\) has full column rank. Hence, by Proposition A1, since \({\varvec{B_s}}\) has full column rank and \({\varvec{F_s^{-1}}}\) is p.d., \({\varvec{\widetilde{Q}_s}} = {\varvec{\widetilde{C}_s^{-1}}} = {\varvec{B^T_s F_s^{-1} B_s}}\) is p.d.

Also, by Corollary A1, since \({\varvec{\widetilde{C}_s^{-1}}}\) is p.d., \({\varvec{\widetilde{C}_s}}\) is p.d. Finally, since \({\widetilde{\pi }}({\varvec{w_S}}) \propto \frac{1}{ \prod _{k=1}^M |{\varvec{F_{b_k}}}|^{1/2}} \exp \left\{ -\frac{1}{2} {\varvec{w_S}}^T {\varvec{{\widetilde{C}}_s}}^{-1} {\varvec{w_S}} \right\} \) with \({\varvec{{\widetilde{C}}_s}}^{-1} = {\varvec{B_s^T F_s ^{-1}B_s}}\) and \({\varvec{\widetilde{C}_s}}\) p.d., \({\widetilde{\pi }}({\varvec{w_S}})\) is the probability density function of a multivariate normal distribution. \(\square \)
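As a numerical complement (an editorial toy check in R under an assumed exponential covariance, not the authors' implementation), the snippet below assembles \({\varvec{B_s}}\), \({\varvec{F_s}}\) and \({\varvec{{\widetilde{Q}}_s}} = {\varvec{B_s^T F_s ^{-1}B_s}}\) for the two-block case above and verifies that the Cholesky factorization of \({\varvec{{\widetilde{Q}}_s}}\) succeeds, confirming positive definiteness:

library(Matrix)                             # for bdiag()
set.seed(3)

n1 <- 6; n2 <- 5; n <- n1 + n2
S  <- matrix(runif(2 * n), ncol = 2)        # locations ordered by block: b1 then b2
C  <- exp(-as.matrix(dist(S)) / 0.3)        # assumed exponential covariance
idx1 <- 1:n1; idx2 <- (n1 + 1):n            # block b1 and block b2, with N(b2) = b1

## Per-block quantities: b1 has no neighbor blocks, so F_{b1} = C_{b1}
B2 <- C[idx2, idx1] %*% solve(C[idx1, idx1])
F1 <- C[idx1, idx1]
F2 <- C[idx2, idx2] - B2 %*% C[idx1, idx2]

## B_s is lower triangular with unit diagonal; F_s^{-1} is block diagonal
Bs    <- rbind(cbind(diag(n1), matrix(0, n1, n2)),
               cbind(-B2,      diag(n2)))
Fsinv <- as.matrix(bdiag(solve(F1), solve(F2)))

Qs <- t(Bs) %*% Fsinv %*% Bs                # blockNNGP precision matrix
Qs <- (Qs + t(Qs)) / 2                      # symmetrize against rounding error
chol(Qs)                                    # no error: Qs is positive definite
max(abs(solve(Qs) - C))                     # ~ 0: exact here, since N(b2) is all of b1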

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Quiroz, Z.C., Prates, M.O., Dey, D.K. et al. Fast Bayesian inference of block Nearest Neighbor Gaussian models for large data. Stat Comput 33, 54 (2023). https://doi.org/10.1007/s11222-023-10227-1



  • DOI: https://doi.org/10.1007/s11222-023-10227-1
