Adaptive sampling for Bayesian geospatial models

Yang, Hongxia; Liu, Fei; Ji, Chunlin; Dunson, David

doi:10.1007/s11222-013-9422-4

Published: 11 September 2013

Statistics and Computing, volume 24, pages 1101–1110 (2014)

Abstract

Bayesian hierarchical modeling with Gaussian process random effects provides a popular approach for analyzing point-referenced spatial data. For large spatial data sets, however, generic posterior sampling is infeasible due to the extremely high computational burden in decomposing the spatial correlation matrix. In this paper, we propose an efficient algorithm—the adaptive griddy Gibbs (AGG) algorithm—to address the computational issues with large spatial data sets. The proposed algorithm dramatically reduces the computational complexity. We show theoretically that the proposed method can approximate the real posterior distribution accurately. The sufficient number of grid points for a required accuracy has also been derived. We compare the performance of AGG with that of the state-of-the-art methods in simulation studies. Finally, we apply AGG to spatially indexed data concerning building energy consumption.

Acknowledgements

We thank Dr. Avishek Chakraborty for very useful discussions and suggestions. This work was partially supported by Award Number R01ES017240 from the National Institute of Environmental Health Sciences. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of Environmental Health Sciences or the National Institutes of Health.

Author information

Authors and Affiliations

Statistical Analysis & Forecasting, Mathematical Sciences Department, Watson Research Center (Yorktown), IBM, Yorktown Heights, NY, 10603, USA
Hongxia Yang
Department of Mathematics, Queens College City University of New York, Queens, NY, 11367-1597, USA
Fei Liu
Kuang-Chi Institute, Shenzhen, China
Chunlin Ji
Department of Statistical Science, Duke University, Durham, NC, 27708-0251, USA
David Dunson

Corresponding author

Correspondence to Hongxia Yang.

Appendix: Proofs

Proof of Theorem 1

We first show that F_C(ϕ) is well defined. For θ=(σ²,τ²) in the likelihood function L(ϕ,θ) in Equation (3), the marginal posterior of ϕ is

$$\begin{aligned} \pi(\phi\mid \boldsymbol {Y}) \propto& \pi(\phi)\int_{ \varTheta} \mathrm{L}(\phi,\boldsymbol{\theta})\pi(\boldsymbol{\theta})\, d\boldsymbol{ \theta}, \end{aligned}$$

where Θ is the support of the posterior distribution of θ. Denote R(ϕ)=∫_Θ L(ϕ,θ)π(θ) dθ. We first verify that R(ϕ) is continuous at every ϕ∈[a,b]. Since |σ²H(ϕ)+τ²I_n|>τ²ⁿ and π(θ) is a product of proper priors, it follows that

(i) The joint posterior for (ϕ,θ) is proper, so R(ϕ) is well defined.
(ii) For any θ∈Θ and ϕ∈[a,b], L(ϕ,θ)≤g(θ) for a function g with the property ∫_Θ g(θ)π(θ) dθ<∞.
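The determinant bound |σ²H(ϕ)+τ²I_n|>τ²ⁿ that underlies (i) and (ii) can be checked numerically. The sketch below is only an illustration: the exponential correlation form for H(ϕ), the random coordinates, and the names `exp_correlation` and `logdet_gap` are assumptions made here, not taken from the paper.

```python
import numpy as np

def exp_correlation(coords, phi):
    """Correlation matrix with entries exp(-phi * distance).

    The exponential form is an assumption made for illustration only.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-phi * dist)

def logdet_gap(coords, phi, sigma2, tau2):
    """Return log|sigma2*H(phi) + tau2*I_n| - n*log(tau2)."""
    n = coords.shape[0]
    cov = sigma2 * exp_correlation(coords, phi) + tau2 * np.eye(n)
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance matrix is not positive definite")
    return logdet - n * np.log(tau2)

# Hypothetical locations in the unit square.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 1.0, size=(50, 2))

# The gap should be positive for every phi, matching the bound in the proof.
gaps = [logdet_gap(coords, phi, sigma2=1.5, tau2=0.3) for phi in (0.5, 2.0, 10.0)]
print(gaps)
```

Working on the log scale with `slogdet` avoids overflow or underflow of the determinant itself when n is large.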

For any sequence ϕ_n→ϕ, continuity of L(ϕ_n,θ) in ϕ implies pointwise convergence to L(ϕ,θ), ∀θ∈Θ. Together with (ii), the Dominated Convergence Theorem then gives R(ϕ_n)→R(ϕ), so R(ϕ) is continuous. For the continuous uniform prior π(ϕ), the marginal posterior cdf of ϕ is

$$\begin{aligned} F_C(\phi) =& \frac{1}{c_1}\int_{\phi_{\min}}^{\phi} R(\eta)\,d\eta,\quad \phi\in(\phi_{\min}, \phi_{\max}), \end{aligned}$$

where $c_{1} = \int_{\phi_{\min}}^{\phi_{\max}} R(\phi)\,d\phi$. Hence F_C(ϕ)<∞, i.e., F_C is well defined.
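A cdf of this form can be approximated on a grid and inverted to draw values of ϕ. The following is a minimal sketch of a generic grid-based inverse-cdf draw in the spirit of griddy Gibbs; it uses a fixed, equally spaced grid and is not claimed to be the adaptive scheme of the paper. The function `toy_log_R` is a hypothetical stand-in for log R(ϕ), and all names are illustrative.

```python
import numpy as np

def grid_inverse_cdf_sample(log_R, phi_min, phi_max, n_grid, size, rng):
    """Draw from the density proportional to exp(log_R(phi)) on [phi_min, phi_max].

    The cdf is approximated with the trapezoid rule on an equally spaced
    grid and inverted by linear interpolation.
    """
    grid = np.linspace(phi_min, phi_max, n_grid)
    log_vals = log_R(grid)
    vals = np.exp(log_vals - log_vals.max())   # stabilise before exponentiating
    steps = 0.5 * (vals[1:] + vals[:-1]) * np.diff(grid)
    cdf = np.concatenate([[0.0], np.cumsum(steps)])
    cdf /= cdf[-1]                             # plays the role of dividing by c_1
    u = rng.uniform(size=size)
    return np.interp(u, cdf, grid)             # inverse cdf by interpolation

def toy_log_R(phi):
    """Hypothetical log R(phi); any continuous function could be used here."""
    return -0.5 * ((phi - 1.2) / 0.4) ** 2

rng = np.random.default_rng(1)
draws = grid_inverse_cdf_sample(toy_log_R, 0.1, 3.0, n_grid=200, size=5000, rng=rng)
print(draws.mean(), draws.std())
```

Only `n_grid` evaluations of R are needed per draw of ϕ, which is where a grid-based scheme saves work when each evaluation requires a matrix decomposition.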

For any ϵ>0, let k be an integer with k>(ϕ_max−ϕ_min)/ϵ and define

$$\begin{aligned} E_{j}=\biggl\{\phi: \frac{j - 1}{k}\leq F_C(\phi) < \frac{j}{k}\biggr\},\quad j = 1, \ldots, k. \end{aligned}$$

We further define $F_{D, k}(\phi) = \frac{j - 1}{k}$ for ϕ∈E_j. Clearly, for any ϕ∈(ϕ_min,ϕ_max), we have

$$0 \leq F_{C}(\phi) - F_{D, k}(\phi) \leq\frac{1}{k}. $$
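This bound follows directly from the two definitions. As a worked step, assuming F_C(ϕ)<1 on the open interval so that each ϕ lies in exactly one set E_j:

$$\begin{aligned} \phi\in E_j \ \Longrightarrow\ F_C(\phi) - F_{D,k}(\phi) = F_C(\phi) - \frac{j-1}{k} \in \biggl[0,\ \frac{j}{k}-\frac{j-1}{k}\biggr) = \biggl[0,\ \frac{1}{k}\biggr). \end{aligned}$$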

Consequently, we obtain the following to complete the proof:

$$\begin{aligned} |F_{D,k}-F_C|_{TV} = & \int _{\phi_{\min}}^{\phi_{\max}} \bigl|F_{D,k}(\phi)-F_C( \phi)\bigr| \,d\phi \\ \leq& \frac{(\phi_{\max}-\phi_{\min})}{k} \\ <& \frac{(\phi_{\max}-\phi_{\min})}{(\phi_{\max} - \phi _{\min})/\epsilon} \\ =& \epsilon. \end{aligned}$$

□
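The construction in the proof can be illustrated numerically. The sketch below builds F_C from a hypothetical positive function R(ϕ), forms the step function F_{D,k} by rounding F_C down to the nearest multiple of 1/k, and compares the integrated absolute difference with (ϕ_max−ϕ_min)/k. The choice of R, the interval endpoints and the value of ϵ are assumptions made for illustration.

```python
import numpy as np

def step_cdf(F_C_vals, k):
    """F_{D,k}: round cdf values down to the nearest multiple of 1/k."""
    return np.floor(k * F_C_vals) / k

phi_min, phi_max = 0.1, 3.0
phi = np.linspace(phi_min, phi_max, 20001)
dphi = np.diff(phi)

# Hypothetical unnormalised R(phi); any positive continuous function works.
R = np.exp(-0.5 * ((phi - 1.2) / 0.4) ** 2) + 0.05

# F_C by the cumulative trapezoid rule, normalised by c_1 = cum[-1].
cum = np.concatenate([[0.0], np.cumsum(0.5 * (R[1:] + R[:-1]) * dphi)])
F_C = cum / cum[-1]

eps = 0.07
k = int(np.floor((phi_max - phi_min) / eps)) + 1   # an integer k > (phi_max - phi_min)/eps
F_D = step_cdf(F_C, k)

gap = F_C - F_D                                    # should lie in [0, 1/k)
l1 = np.sum(0.5 * (gap[1:] + gap[:-1]) * dphi)     # integral of |F_C - F_{D,k}|
bound = (phi_max - phi_min) / k

print(k, l1, bound, eps)
```

In this setup the integrated difference sits below the bound (ϕ_max−ϕ_min)/k, which in turn is below ϵ, mirroring the chain of inequalities that closes the proof.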

About this article

Cite this article

Yang, H., Liu, F., Ji, C. et al. Adaptive sampling for Bayesian geospatial models. Stat Comput 24, 1101–1110 (2014). https://doi.org/10.1007/s11222-013-9422-4

Received: 02 February 2012
Accepted: 26 August 2013
Published: 11 September 2013
Issue Date: November 2014
DOI: https://doi.org/10.1007/s11222-013-9422-4
