Imputing censored data with desirable spatial covariance function properties using simulated annealing

  • Original Article
  • Journal of Geographical Systems

Abstract

When measurements below the limit of detection are reported as not detected, the data are referred to as censored. The non-recording of values below the limit of detection is common in soil science research, although modelling data affected by censoring can be problematic. This paper develops and tests a modified version of Spatial Simulated Annealing, called Simulated Annealing by Variogram and Histogram form (SAVH), for drawing values for censored points given a mixed set of observed and censored data. The algorithm aims to maximise the goodness of fit between the experimental and theoretical variograms (by allowing the parameters of the theoretical variogram to vary) while constraining the imputed values to a target histogram form. In practice, the experimental histogram is estimated by transforming the available data (interval and exact observations) to quantiles and fitting a plausible distribution. The theoretical distribution of the data is used to constrain the variogram fitting. The proposed simulated annealing method is designed to find the optimal spatial arrangement of values, defined as the arrangement with the lowest errors in variogram fitting, histogram fitting and kriging prediction. The accuracy of the proposed method is assessed on a simulated data set in which the censored point values are known, and is compared with that of the Spatial Simulated Annealing algorithm. According to the results obtained, the SAVH approach can be recommended as a useful tool for the analysis of spatially distributed data with censoring.
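The general idea of imputing censored values by annealing against a variogram can be illustrated with a short sketch. This is not the authors' SAVH implementation. It is a simplified stand-in that rests on several assumptions: the theoretical variogram is a fixed exponential model (SAVH lets its parameters vary), the histogram constraint is replaced by uniform proposals between zero and the limit of detection, the kriging-error term is omitted, and the cooling schedule is geometric. All function names and parameters (`impute_censored`, `lag_width`, `cooling`, and so on) are hypothetical.

```python
import math
import random


def exponential_variogram(h, nugget, sill, practical_range):
    """Theoretical exponential variogram model at lag distance h."""
    return nugget + sill * (1.0 - math.exp(-3.0 * h / practical_range))


def lag_pairs(coords, lag_width, n_lags):
    """Assign every pair of points to a distance bin (computed once)."""
    pairs = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            k = int(math.dist(coords[i], coords[j]) // lag_width)
            if k < n_lags:
                pairs.append((i, j, k))
    return pairs


def variogram_misfit(values, pairs, model, lag_width, n_lags):
    """Sum of squared differences between experimental and model variograms."""
    sums, counts = [0.0] * n_lags, [0] * n_lags
    for i, j, k in pairs:
        sums[k] += 0.5 * (values[i] - values[j]) ** 2
        counts[k] += 1
    misfit = 0.0
    for k in range(n_lags):
        if counts[k]:  # skip empty distance bins
            lag_centre = (k + 0.5) * lag_width
            gamma_model = exponential_variogram(lag_centre, *model)
            misfit += (sums[k] / counts[k] - gamma_model) ** 2
    return misfit


def impute_censored(coords, values, censored, lod, model,
                    lag_width=1.0, n_lags=4, n_iter=1000,
                    t0=1.0, cooling=0.995, seed=0):
    """Anneal values at censored indices within [0, lod]; return (values, misfit).

    Assumption: uniform proposals stand in for the target histogram form.
    """
    rng = random.Random(seed)
    pairs = lag_pairs(coords, lag_width, n_lags)
    current = list(values)
    for i in censored:  # random starting values below the detection limit
        current[i] = rng.uniform(0.0, lod)
    cost = variogram_misfit(current, pairs, model, lag_width, n_lags)
    best, best_cost = list(current), cost
    if not censored:
        return best, best_cost
    temperature = t0
    for _ in range(n_iter):
        i = rng.choice(censored)
        previous = current[i]
        current[i] = rng.uniform(0.0, lod)  # perturb one censored point
        new_cost = variogram_misfit(current, pairs, model, lag_width, n_lags)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temperature):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = list(current), cost
        else:
            current[i] = previous  # reject and restore
        temperature *= cooling  # geometric cooling (an assumption)
    return best, best_cost
```

A call would pass the point coordinates, a value list with placeholders (for example `None`) at the censored positions, the indices of those positions, the limit of detection, and a `(nugget, sill, practical_range)` tuple. Observed values are never modified; only the censored positions are perturbed.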

Notes

  1. The R project. www.r-project.org.

  2. The Apulia Region. http://bdt.regione.puglia.it/home.html.

Acknowledgments

This study was supported by a fellowship from the Master and Back program financed by the Regional Sardinia Government, under an agreement between the School of Geography, University of Southampton (UK) and the Dipartimento di Economia e Sistemi Arborei, University of Sassari (Italy). Thanks are due to the Apulia Regional Authority for Ecology and the Water Research Institute of the National Research Council for providing the data used in this study. Finally, the authors would like to acknowledge Dr. Edith Cheng at the University of Southampton for inspiring the analysis and for helpful assistance.

Author information

Corresponding author

Correspondence to L. Sedda.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOC 54 kb)

About this article

Cite this article

Sedda, L., Atkinson, P.M., Barca, E. et al. Imputing censored data with desirable spatial covariance function properties using simulated annealing. J Geogr Syst 14, 265–282 (2012). https://doi.org/10.1007/s10109-010-0145-1

  • DOI: https://doi.org/10.1007/s10109-010-0145-1
