Area-to-point Kriging in spatial hedonic pricing models

  • Original Article
  • Journal of Geographical Systems

Abstract

This paper proposes a geostatistical hedonic price model in which the effects of location on house values are explicitly modeled. The proposed geostatistical approach, namely area-to-point Kriging with External Drift (A2PKED), can take into account spatial dependence and spatial heteroskedasticity, if they exist. Furthermore, this approach has significant implications in situations where exhaustive area-averaged housing price data are available in addition to a subset of individual housing price data. In the case study, we demonstrate that A2PKED substantially improves the quality of predictions using apartment sale transaction records that occurred in Seoul, South Korea, during 2003. The improvement is illustrated via a comparative analysis, where predicted values obtained from different models, including two traditional regression-based hedonic models and a point-support geostatistical model, are compared to those obtained from the A2PKED model.

References

  • Anselin L (2002) Under the hood: issues in the specification and interpretation of spatial regression models. Agric Econ 27:247–267

  • Anselin L (1998) Spatial econometrics: methods and models. Kluwer, Dordrecht

  • Basu S, Thibodeau TG (1998) Analysis of spatial autocorrelation in house prices. J Real Estate Financ Econ 17:61–85

  • Black SE (1999) Do better schools matter? Parental valuation of elementary education. Quart J Econ 114(2):577–599

  • Can A (1990) The measurement of neighborhood dynamics in urban house prices. Econ Geogr 66(3):254–272

  • Can A, Megbolugbe I (1997) Spatial dependence and house price index construction. J Real Estate Financ Econ 14:203–222

  • Chica-Olmo J (2007) Prediction of housing location price by a multivariate spatial method: cokriging. J Real Estate Res 29(2):233–254

  • Chica-Olmo J (1995) Spatial estimation of housing prices and locational rent. Urban Stud 32(8):1331–1344

  • Chilès JP, Delfiner P (1999) Geostatistics: modeling spatial uncertainty. Wiley, New York

  • Cressie N (1993) Statistics for spatial data. Wiley, New York

  • Dubin RA (1988) Estimation of regression coefficients in the presence of spatially autocorrelated error terms. Rev Econ Stat 70:466–474

  • Dubin RA (1992) Spatial autocorrelation and neighborhood quality. Region Sci Urban Econ 22:433–452

  • Dubin RA (1998) Predicting house prices using multiple listings data. J Real Estate Financ Econ 17:35–59

  • Dubin RA, Pace RK, Thibodeau TG (1999) Spatial autoregression techniques for real estate data. J Real Estate Lit 7:79–95

  • Gelfand AE, Ecker MD, Knight JR, Sirmans CF (2004) The dynamics of location in home price. J Real Estate Financ Econ 29(2):149–166

  • Goodman AC, Thibodeau TG (1998) Housing market segmentation. J Hous Econ 7:121–143

  • Goovaerts P (2005) Geostatistical analysis of disease data: estimation of cancer mortality risk from empirical frequencies using Poisson kriging. Int J Health Geogr 4(31)

  • Goovaerts P (1997) Geostatistics for natural resources evaluation. Oxford University Press, New York

  • Gotway CA, Young LJ (2002) Combining incompatible spatial data. J Am Stat Assoc 97(458):632–648

  • Jones K, Bullen N (1994) Contextual models of urban house prices: a comparison of fixed- and random-coefficient models developed by expansion. Econ Geogr 70(3):252–272

  • Journel AG, Huijbregts ChJ (1978) Mining geostatistics. Academic Press, New York

  • Kim CW, Phipps TT, Anselin L (2003) Measuring the benefits of air quality improvement: a spatial hedonic approach. J Environ Econ Manage 45:24–39

  • Kutner MH, Nachtsheim CJ, Neter J, Li W (1974) Applied linear statistical models, 5th edn. McGraw-Hill, London

  • Kyriakidis PC (2004) A geostatistical framework for the area-to-point spatial interpolation. Geogr Anal 36(3):41–50

  • Kyriakidis PC, Goodchild MF (2006) On the prediction error variance of three common spatial interpolation schemes. Int J Geogr Inform Sci 20(8):823–855

  • LeSage JP, Pace RK (2004a) Models for spatially dependent missing data. J Real Estate Financ Econ 29(2):233–254

  • LeSage JP, Pace RK (eds) (2004b) Spatial and spatiotemporal econometrics. Elsevier, Oxford

  • Montgomery DC, Peck EA, Vining GG (2001) Introduction to linear regression analysis. Wiley, New York

  • Neuman SP, Jacobson EA (1984) Analysis of nonintrinsic spatial variability by residual kriging with application to regional ground water levels. Math Geol 16(5):499–521

  • Orford S (2000) Modelling spatial structures in local housing market dynamics: a multilevel perspective. Urban Stud 37(9):1643–1671

  • Pace RK, Gilley OW (1998) Generalizing the OLS and grid estimators. Real Estate Econ 26:331–347

  • Páez A, Uchida T, Miyamoto K (2001) Spatial association and heterogeneity issues in land price models. Urban Stud 38(9):1493–1508

  • Páez A, Long F, Farber S (2008) Moving window approaches for hedonic price estimation: an empirical comparison of modeling techniques. Urban Stud 45(8):1565–1581

  • Ripley BD (1981) Spatial statistics. Wiley, New York

  • Rosen S (1974) Hedonic prices and implicit markets: product differentiation in pure competition. J Polit Econ 82(1):34–55

  • Yoo E-H, Kyriakidis PC (2008) Area-to-point predictions under boundary conditions. Geogr Anal 40(4):355–379

Acknowledgments

P. C. Kyriakidis acknowledges funding provided by the National Geospatial-Intelligence Agency (NGA) under award HM1582-07-2020.

Author information

Corresponding author

Correspondence to E.-H. Yoo.

Appendices

Appendix 1: A2PKED as a form of generalized linear regression

Given n point-support data and K area-averaged data, the A2PKED prediction of the unknown value at a location \({{\mathbf{u}}}_p\) (see Eq. 2 for the original form) can be rewritten as a linear regression model with correlated residuals, i.e., as the sum of a predicted mean response and a predicted residual (Chilès and Delfiner 1999):

$$ \begin{aligned} {\hat z}({{\mathbf{u}}}_p) &= {\hat \mu}({{\mathbf{u}}}_p) + {\hat r}({{\mathbf{u}}}_p) \\ &= {\hat{\varvec{\upbeta}}}^T {{\mathbf{f}}}_p^{{\mathbf{u}}} + \left[\tilde{{\varvec{\upeta}}}_p^T {{\mathbf{r}}}_{{\mathbf{u}}} + \tilde{{\varvec{\lambda}}}_p^T {{\mathbf{r}}}_s \right]\\ &= \sum_{m=0}^M {\hat\upbeta}_m f_m({{\mathbf{u}}}_p) + \left[\sum_{i=1}^{n} \tilde{\eta}_p({{\mathbf{u}}}_i) r({{\mathbf{u}}}_i)+\sum_{k=1}^{K}\tilde{\lambda}_p(s_k)r(s_k)\right]\\ \end{aligned} $$
(9)

where \(\tilde{{\varvec{\upeta}}}_p\) and \(\tilde{{\varvec{\lambda}}}_p\) denote, respectively, the area-to-point Simple Kriging (A2PSK) weights for n point data residuals and the K areal data residuals.

The GLS estimator of regression coefficients \({\varvec{\upbeta}}\) can be obtained as a function of the spatial correlation model of the underlying residuals \({\varvec{\Upsigma}}_R\) and the design matrix F as (Cressie 1993, p 167):

$$ {\hat{\varvec{\upbeta}}} = ({{\mathbf{F}}}^T {\varvec{\Upsigma}}_R^{-1}{{\mathbf{F}}})^{-1}{{\mathbf{F}}}^T {\varvec{\Upsigma}}_R^{-1}{{\mathbf{z}}} $$
(10)

with

$$ {{\mathbf{F}}} = \left[\begin{array}{l} {{\mathbf{F}}}_{{\mathbf{u}}} \\ {{\mathbf{F}}}_s \\ \end{array}\right], \quad {\varvec{\Upsigma}}_R = \left[\begin{array}{ll} {\varvec{\Upsigma}}_{{{\mathbf{u}}}{{\mathbf{u}}}}^R & {\varvec{\Upsigma}}_{{{\mathbf{u}}}s}^R\\ {\varvec{\Upsigma}}_{s{{\mathbf{u}}}}^R & {\varvec{\Upsigma}}_{ss}^R\\ \end{array}\right], \quad {{\mathbf{z}}} = \left[\begin{array}{l} {{\mathbf{z}}}_{{\mathbf{u}}}\\ {{\mathbf{z}}}_s\\ \end{array}\right] $$
(11)

Note that when \({\varvec{\Upsigma}}_R\) is diagonal with constant entries along its main diagonal, the drift prediction obtained from Eq. 10 coincides with that of OLS. Again, a unique solution for the estimate \({\hat{\varvec{\upbeta}}}\) requires the ((n + K) × (n + K)) matrix of variogram values of residuals \({\varvec{\Upsigma}}_R\) to be non-singular and the ((n + K) × (M + 1)) design matrix F to be of full column rank.
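The reduction of the GLS estimator of Eq. 10 to OLS can be checked numerically. The following is a minimal sketch with synthetic inputs: the function name `gls_coefficients`, the random design matrix, and the data vector are illustrative assumptions, not the case-study data.

```python
import numpy as np

def gls_coefficients(F, Sigma_R, z):
    """GLS estimate of the drift coefficients, following the form of Eq. 10."""
    # Solve linear systems rather than forming the inverse of Sigma_R.
    Si_F = np.linalg.solve(Sigma_R, F)
    Si_z = np.linalg.solve(Sigma_R, z)
    return np.linalg.solve(F.T @ Si_F, F.T @ Si_z)

# Synthetic stand-ins (assumptions): n point data, K areal data, M covariates.
rng = np.random.default_rng(0)
n, K, M = 8, 3, 2
F = np.column_stack([np.ones(n + K), rng.normal(size=(n + K, M))])
z = rng.normal(size=n + K)

# Diagonal Sigma_R with a constant diagonal: GLS should match OLS.
beta_gls = gls_coefficients(F, 2.5 * np.eye(n + K), z)
beta_ols, *_ = np.linalg.lstsq(F, z, rcond=None)
```

Both calls fail or become unstable under the same conditions named in the text: a singular \({\varvec{\Upsigma}}_R\) or a design matrix without full column rank.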

Once the drift component is predicted as \({\hat \mu}({{\mathbf{u}}}_p) = {\hat {\varvec{\upbeta}}}^T{{\mathbf{f}}}_p^{{\mathbf{u}}}\) using the GLS estimator of regression coefficients \({\hat{\varvec{\upbeta}}},\) we can predict the unknown residual component \({\hat r}({{\mathbf{u}}}_p)\) at the prediction location \({{\mathbf{u}}}_p\) using A2PSK. The A2PSK prediction at \({{\mathbf{u}}}_p\) is a weighted linear combination of the n “point data” residuals \({{\mathbf{r}}}_{{\mathbf{u}}} = [r({{\mathbf{u}}}_i) = z({{\mathbf{u}}}_i) - \sum_{m=0}^M f_m({{\mathbf{u}}}_i){\hat \upbeta}_m,\;i = 1, \ldots, n]^T\) and the K “areal data” residuals \({{\mathbf{r}}}_s = [r(s_k) = z(s_k) - \sum_{m=0}^M f_m(s_k){\hat{\upbeta}}_m, k = 1, \ldots, K]^T\) as:

$$ {\hat r}({{\mathbf{u}}}_p) = \tilde{{\varvec{\upeta}}}_p^T {{\mathbf{r}}}_{{\mathbf{u}}} + \tilde{{\varvec{\lambda}}}_p^T {{\mathbf{r}}}_s = \sum_{i=1}^{n} \tilde{\eta}_p({{\mathbf{u}}}_i) r({{\mathbf{u}}}_i) + \sum_{k=1}^{K} \tilde{\lambda}_p(s_k)r(s_k). $$
(12)

Here, the A2PSK weights \(\tilde{{\varvec{\upeta}}}_p = [{\tilde{\eta}}_p({{\mathbf{u}}}_i), i=1, \ldots, n]^T\) for the n point data and the weights \(\tilde{\varvec{\lambda}}_p = [\tilde \lambda_p(s_k), k = 1, \ldots, K]^T\) for the areal data are obtained by solving an A2PSK system similar to that of Eq. 4:

$$ \left[\begin{array}{ll} {\varvec{\Upsigma}}_{{{\mathbf{u}}}{{\mathbf{u}}}}^R & {\varvec{\Upsigma}}_{{{\mathbf{u}}}s}^R \\ {\varvec{\Upsigma}}_{s{{\mathbf{u}}}}^R & {\varvec{\Upsigma}}_{ss}^R\\ \end{array}\right] \left[\begin{array}{l} \tilde{{\varvec{\upeta}}}_p \\ {\tilde{\varvec{\lambda}}}_p\\ \end{array}\right]= \left[\begin{array}{l} {\varvec{\sigma}}_p^{{\mathbf{u}}} \\ {\varvec{\sigma}}_p^s\\ \end{array}\right] $$
(13)

where the covariance model of residuals involved in the above A2PSK system is assumed to be identical to that used in Eq. 4.
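The system of Eq. 13 and the residual prediction of Eq. 12 can be sketched as follows. Several ingredients here are assumptions made only for illustration: an exponential point covariance with arbitrary sill and range, areal units represented by small sets of discretization points, and area-related covariances approximated by unweighted averages of point covariances. The helper names `cov`, `cell`, and `a2psk_weights`, and all coordinates and residual values, are hypothetical.

```python
import numpy as np

def cov(a, b, sill=1.0, corr_len=2.0):
    """Exponential point covariance between coordinate sets a (p, 2) and b (q, 2)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-d / corr_len)

def cell(x0, y0, size=2.0, m=3):
    """m*m discretization points of a square area with lower-left corner (x0, y0)."""
    g = size * (np.arange(m) + 0.5) / m
    return np.array([[x0 + gx, y0 + gy] for gx in g for gy in g])

def a2psk_weights(pts, areas, u_p):
    """Assemble and solve a system of the form of Eq. 13 for one target location u_p."""
    n, K = len(pts), len(areas)
    S = np.empty((n + K, n + K))
    S[:n, :n] = cov(pts, pts)                        # point-to-point block
    for k, a in enumerate(areas):
        S[:n, n + k] = S[n + k, :n] = cov(pts, a).mean(axis=1)  # point-to-area
        for l, b in enumerate(areas):
            S[n + k, n + l] = cov(a, b).mean()       # area-to-area
    rhs = np.concatenate([cov(pts, u_p[None, :])[:, 0],
                          [cov(a, u_p[None, :]).mean() for a in areas]])
    w = np.linalg.solve(S, rhs)
    return w[:n], w[n:]                              # eta (points), lambda (areas)

pts = np.array([[0.4, 0.7], [1.5, 0.2], [2.5, 1.7], [3.1, 0.9]])
areas = [cell(0.0, 0.0), cell(2.0, 0.0)]
r_u = np.array([0.3, -0.1, 0.2, 0.05])               # made-up point residuals
r_s = np.array([0.1, -0.05])                         # made-up areal residuals

# Predicting at a data location should return that datum's residual.
eta0, lam0 = a2psk_weights(pts, areas, pts[0])
r_hat0 = eta0 @ r_u + lam0 @ r_s

# Residual prediction (form of Eq. 12) at a new location.
eta, lam = a2psk_weights(pts, areas, np.array([1.0, 1.0]))
r_hat = eta @ r_u + lam @ r_s
```

When the target coincides with a point datum, the right-hand side equals one column of the left-hand matrix, so the weight vector is a unit vector; this is a convenient sanity check on any implementation of the system.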

In summary, A2PKED is equivalent to optimum drift estimation followed by area-to-point simple Kriging of the residuals from this drift estimate, as if the mean were known. The A2PKED prediction error variance accounts for the fact that the drift is actually unknown but estimated.
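This equivalence can be illustrated numerically. The sketch below compares the two-step route (GLS drift, then simple Kriging of residuals) with a one-step prediction from a generic Kriging-with-external-drift system that carries unbiasedness constraints. The augmented system is written in its standard textbook form, which is assumed here to share the structure of Eq. 4; all matrices and vectors are synthetic stand-ins rather than covariances derived from real point and areal supports.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M1 = 10, 3                        # N = n + K data, M1 = M + 1 drift functions

# Synthetic symmetric positive-definite residual covariance (assumption).
A = rng.normal(size=(N, N))
Sigma = A @ A.T + N * np.eye(N)
F = np.column_stack([np.ones(N), rng.normal(size=(N, M1 - 1))])
z = rng.normal(size=N)
sigma_p = rng.normal(size=N)         # stand-in data-to-target covariances
f_p = np.concatenate([[1.0], rng.normal(size=M1 - 1)])

# Two-step route: GLS drift (Eq. 10), then simple Kriging of residuals (Eq. 12).
beta = np.linalg.solve(F.T @ np.linalg.solve(Sigma, F),
                       F.T @ np.linalg.solve(Sigma, z))
w = np.linalg.solve(Sigma, sigma_p)
z_two_step = f_p @ beta + w @ (z - F @ beta)

# One-step route: weights from the constrained (augmented) system.
lhs = np.block([[Sigma, F], [F.T, np.zeros((M1, M1))]])
sol = np.linalg.solve(lhs, np.concatenate([sigma_p, f_p]))
lam = sol[:N]                        # Kriging weights applied directly to z
z_one_step = lam @ z
```

The one-step weights satisfy the constraints \({{\mathbf{F}}}^T{\varvec{\lambda}} = {{\mathbf{f}}}_p\), and the two predictions agree up to floating-point error.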

Appendix 2: Estimation of local drift coefficients in A2PKED

Consider the task of predicting the unknown value at location \({{\mathbf{u}}}_p\) using n point data and a subset of the areal data instead of all K such data. This modification of the original A2PKED system in Eq. 4 may be necessary when dealing with large point-support and areal-support data sets, or when spatial variation is modeled statistically, i.e., through spatially varying regression coefficients.

In our case study, we include only the one areal datum whose region contains the prediction location \({{\mathbf{u}}}_p\), which may affect the efficiency of the GLS estimator as well as the residual component. This change to the original system may cause problems in the design matrix, particularly when dummy variables associated with areal data are included. However, it does not destroy the desirable properties of Kriging prediction, such as unbiasedness and minimum prediction error variance. In what follows, we present the A2PKED prediction at location \({{\mathbf{u}}}_p\) in the form of a linear regression with correlated residuals, based on n point data and a single areal datum:

$$ {\hat z}({{\mathbf{u}}}_p) = \sum_{m=0}^M {\hat \upbeta}_m({{\mathbf{u}}}_p) f_m({{\mathbf{u}}}_p) + \left[\sum_{i=1}^{n} \tilde{\eta}_p({{\mathbf{u}}}_i) r({{\mathbf{u}}}_i) + \tilde{\lambda}_p(s_k)r(s_k) \right] $$
(14)

where \({\tilde \eta}_p({{\mathbf{u}}}_i)\) and \({\tilde \lambda}_p(s_k)\) denote, respectively, the A2PSK weight assigned to the ith point-support residual \(r({{\mathbf{u}}}_i) = z({{\mathbf{u}}}_i)-{\hat \mu}({{\mathbf{u}}}_i)\) and the weight assigned to the kth areal residual \(r(s_k) = z(s_k) - {\hat \mu}(s_k)\) of the region \(s_k\) in which the prediction location \({{\mathbf{u}}}_p\) falls. Note that the A2PSK weights for the point data and the single areal datum in Eq. 14 need to be updated at each prediction location, because the areal datum associated with the prediction location changes from one area to another. This amounts to estimating a spatially varying (local) linear drift component whose GLS regression coefficients are constant within each neighborhood but vary from one area to another. For example, the GLS regression coefficients \({\hat{\upbeta}}_m({{\mathbf{u}}}_p)\), m = 0,…, M, at location \({{\mathbf{u}}}_p\) differ from the coefficients \({\hat \upbeta}_m({{\mathbf{u}}}_{p'})\) at location \({{\mathbf{u}}}_{p'}\) if \({{\mathbf{u}}}_p \in s_k\) and \({{\mathbf{u}}}_{p'} \in s_{k'}\) with k ≠ k′.

Typically, the application of GLS to hedonic price models assumes that the relationship between house price and covariates is fixed over the study area. The proposed approach takes into account heteroskedasticity present in house prices so that the implicit price of housing attributes varies spatially by submarkets defined by areal units (Orford 2000).
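The local estimation described in this appendix can be sketched as follows: for each prediction location, the containing areal unit is identified, and the GLS coefficients are re-estimated from the n point data plus that single areal datum. This is a minimal illustration under stated assumptions, not the case-study implementation: areal units are taken to be axis-aligned rectangles, the data rows are ordered with the n point data first and the K areal data after them, and the names `area_of` and `local_gls` and all numerical inputs are hypothetical.

```python
import numpy as np

def area_of(u_p, bounds):
    """Index of the first rectangle (xmin, ymin, xmax, ymax) that contains u_p."""
    for k, (x0, y0, x1, y1) in enumerate(bounds):
        if x0 <= u_p[0] < x1 and y0 <= u_p[1] < y1:
            return k
    raise ValueError("prediction location lies outside all areas")

def local_gls(F, Sigma, z, n, k):
    """GLS drift coefficients from the n point data plus areal datum k only."""
    idx = np.r_[0:n, n + k]                  # rows: all points, one area
    F_l, z_l = F[idx], z[idx]
    S_l = Sigma[np.ix_(idx, idx)]            # matching covariance sub-block
    Si_F = np.linalg.solve(S_l, F_l)
    Si_z = np.linalg.solve(S_l, z_l)
    return np.linalg.solve(F_l.T @ Si_F, F_l.T @ Si_z)

# Synthetic stand-ins (assumptions): 6 point data, 2 areal data, 1 covariate.
rng = np.random.default_rng(2)
n, K = 6, 2
A = rng.normal(size=(n + K, n + K))
Sigma = A @ A.T + (n + K) * np.eye(n + K)
F = np.column_stack([np.ones(n + K), rng.normal(size=n + K)])
beta_true = np.array([3.0, -1.5])
z_exact = F @ beta_true                       # noise-free data for a sanity check
z_noisy = z_exact + rng.normal(scale=0.1, size=n + K)

bounds = [(0.0, 0.0, 1.0, 1.0), (1.0, 0.0, 2.0, 1.0)]
k_a = area_of(np.array([0.5, 0.5]), bounds)
k_b = area_of(np.array([1.5, 0.5]), bounds)

# Noise-free data: both local fits recover the same coefficients.
b_a_exact = local_gls(F, Sigma, z_exact, n, k_a)
b_b_exact = local_gls(F, Sigma, z_exact, n, k_b)
# Noisy data: the local coefficients generally differ between areas.
b_a = local_gls(F, Sigma, z_noisy, n, k_a)
b_b = local_gls(F, Sigma, z_noisy, n, k_b)
```

With noise-free data the choice of areal datum is irrelevant, which makes a useful check; with noisy data the two coefficient vectors will in general differ, mirroring the area-to-area variation described above.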

About this article

Cite this article

Yoo, EH., Kyriakidis, P.C. Area-to-point Kriging in spatial hedonic pricing models. J Geogr Syst 11, 381–406 (2009). https://doi.org/10.1007/s10109-009-0090-z

  • DOI: https://doi.org/10.1007/s10109-009-0090-z
