Area-to-point Kriging in spatial hedonic pricing models

Yoo, E.-H.; Kyriakidis, P. C.

doi:10.1007/s10109-009-0090-z

Original Article
Published: 13 June 2009

Journal of Geographical Systems, Volume 11, pages 381–406 (2009)

E.-H. Yoo¹ &
P. C. Kyriakidis²

Abstract

This paper proposes a geostatistical hedonic price model in which the effects of location on house values are explicitly modeled. The proposed geostatistical approach, namely area-to-point Kriging with External Drift (A2PKED), can take into account spatial dependence and spatial heteroskedasticity, if they exist. Furthermore, this approach has significant implications in situations where exhaustive area-averaged housing price data are available in addition to a subset of individual housing price data. In the case study, we demonstrate that A2PKED substantially improves the quality of predictions using apartment sale transaction records that occurred in Seoul, South Korea, during 2003. The improvement is illustrated via a comparative analysis, where predicted values obtained from different models, including two traditional regression-based hedonic models and a point-support geostatistical model, are compared to those obtained from the A2PKED model.

References

Anselin L (2002) Under the hood: issues in the specification and interpretation of spatial regression models. Agric Econ 27:247–267
Anselin L (1998) Spatial econometrics: methods and models. Kluwer, Dordrecht
Basu S, Thibodeau TG (1998) Analysis of spatial autocorrelation in house prices. J Real Estate Financ Econ 17:61–85
Black SE (1999) Do better schools matter? Parental valuation of elementary education. Quart J Econ 114(2):577–599
Can A (1990) The measurement of neighborhood dynamics in urban house prices. Econ Geogr 66(3):254–272
Can A, Megbolugbe I (1997) Spatial dependence and house price index construction. J Real Estate Financ Econ 14:203–222
Chica-Olmo J (2007) Prediction of housing location price by a multivariate spatial method: cokriging. J Real Estate Res 29(2):233–254
Chica-Olmo J (1995) Spatial estimation of housing prices and locational rent. Urban Stud 32(8):1331–1344
Chilès JP, Delfiner P (1999) Geostatistics: modeling spatial uncertainty. Wiley, New York
Cressie N (1993) Statistics for spatial data. Wiley, New York
Dubin RA (1988) Estimation of regression coefficients in the presence of spatially autocorrelated error terms. Rev Econ Stat 70:466–474
Dubin RA (1992) Spatial autocorrelation and neighborhood quality. Region Sci Urban Econ 22:433–452
Dubin RA (1998) Predicting house prices using multiple listings data. J Real Estate Financ Econ 17:35–59
Dubin RA, Pace RK, Thibodeau TG (1999) Spatial autoregression techniques for real estate data. J Real Estate Lit 7:79–95
Gelfand AE, Ecker MD, Knight JR, Sirmans CF (2004) The dynamics of location in home price. J Real Estate Financ Econ 29(2):149–166
Goodman AC, Thibodeau TG (1998) Housing market segmentation. J Hous Econ 7:121–143
Goovaerts P (2005) Geostatistical analysis of disease data: estimation of cancer mortality risk from empirical frequencies using Poisson kriging. Int J Health Geogr 4(31)
Goovaerts P (1997) Geostatistics for natural resources evaluation. Oxford University Press, New York
Gotway CA, Young LJ (2002) Combining incompatible spatial data. J Am Stat Assoc 97(458):632–648
Jones K, Bullen N (1994) Contextual models of urban house prices: a comparison of fixed- and random-coefficient models developed by expansion. Econ Geogr 70(3):252–272
Journel AG, Huijbregts ChJ (1978) Mining geostatistics. Academic Press, New York
Kim CW, Phipps TT, Anselin L (2003) Measuring the benefits of air quality improvement: a spatial hedonic approach. J Environ Econ Manage 45:24–39
Kutner MH, Nachtsheim CJ, Neter J, Li W (1974) Applied linear statistical models, 5th edn. McGraw-Hill, London
Kyriakidis PC (2004) A geostatistical framework for the area-to-point spatial interpolation. Geogr Anal 36(3):41–50
Kyriakidis PC, Goodchild MF (2006) On the prediction error variance of three common spatial interpolation schemes. Int J Geogr Inform Sci 20(8):823–855
LeSage JP, Pace RK (2004a) Models for spatially dependent missing data. J Real Estate Financ Econ 29(2):233–254
LeSage JP, Pace RK (eds) (2004b) Spatial and spatiotemporal econometrics. Elsevier, Oxford
Montgomery CC, Peck EA, Vining GG (2001) Introduction to linear regression analysis. Wiley, New York
Neuman SP, Jacobson EA (1984) Analysis of nonintrinsic spatial variability by residual kriging with application to regional ground water levels. Math Geol 16(5):499–521
Orford S (2000) Modelling spatial structures in local housing market dynamics: a multilevel perspective. Urban Stud 37(9):1643–1671
Pace RK, Gilley OW (1998) Generalizing the OLS and grid estimators. Real Estate Econ 26:331–347
Páez A, Uchida T, Miyamoto K (2001) Spatial association and heterogeneity issues in land price models. Urban Stud 38(9):1493–1508
Páez A, Long F, Farber S (2008) Moving window approaches for hedonic price estimation: An empirical comparison of modeling techniques. Urban Stud 45(8):1565–1581
Ripley BD (1981) Spatial statistics. Wiley, New York
Rosen S (1974) Hedonic prices and implicit markets: product differentiation in pure competition. J Polit Econ 82(1):34–55
Yoo E-H, Kyriakidis PC (2008) Area-to-point predictions under boundary conditions. Geogr Anal 40(4):355–379

Acknowledgments

P. C. Kyriakidis acknowledges funding provided by the National Geospatial-Intelligence Agency (NGA) under award HM1582-07-2020.

Author information

Authors and Affiliations

Department of Geography, University at Buffalo, The State University of New York, Buffalo, NY, USA
E.-H. Yoo
Department of Geography, University of California Santa Barbara, Santa Barbara, CA, USA
P. C. Kyriakidis

Corresponding author

Correspondence to E.-H. Yoo.

Appendices

Appendix 1: A2PKED as a form of generalized linear regression

Given both n point-support data and K area-averaged data, the A2PKED prediction of the unknown value at a location ${{\mathbf{u}}}_p$ (see Eq. 2 for the original form) can be rewritten in the form of a linear regression model with correlated residuals, i.e., as the sum of a point prediction of the mean response and a prediction of the residual (Chilès and Delfiner 1999):

$$ \begin{aligned} {\hat z}({{\mathbf{u}}}_p) &= {\hat \mu}({{\mathbf{u}}}_p) + {\hat r}({{\mathbf{u}}}_p) \\ &= {\hat{\varvec{\upbeta}}}^T {{\mathbf{f}}}_p^{{\mathbf{u}}} + \left[\tilde{{\varvec{\upeta}}}_p^T {{\mathbf{r}}}_{{\mathbf{u}}} + \tilde{{\varvec{\lambda}}}_p^T {{\mathbf{r}}}_s \right]\\ &= \sum_{m=0}^M {\hat\upbeta}_m f_m({{\mathbf{u}}}_p) + \left[\sum_{i=1}^{n} \tilde{\eta}_p({{\mathbf{u}}}_i) r({{\mathbf{u}}}_i)+\sum_{k=1}^{K}\tilde{\lambda}_p(s_k)r(s_k)\right]\\ \end{aligned} $$

(9)

where $\tilde{{\varvec{\upeta}}}_p$ and $\tilde{{\varvec{\lambda}}}_p$ denote, respectively, the area-to-point Simple Kriging (A2PSK) weights for n point data residuals and the K areal data residuals.

The GLS estimator of regression coefficients ${\varvec{\upbeta}}$ can be obtained as a function of the spatial correlation model of the underlying residuals ${\varvec{\Upsigma}}_R$ and the design matrix F as (Cressie 1993, p 167):

$$ {\hat{\varvec{\upbeta}}} = ({{\mathbf{F}}}^T {\varvec{\Upsigma}}_R^{-1}{{\mathbf{F}}})^{-1}{{\mathbf{F}}}^T {\varvec{\Upsigma}}_R^{-1}{{\mathbf{z}}} $$

(10)

with

$$ {{\mathbf{F}}} = \left[\begin{array}{l} {{\mathbf{F}}}_{{\mathbf{u}}} \\ {{\mathbf{F}}}_s \\ \end{array}\right], \quad {\varvec{\Upsigma}}_R = \left[\begin{array}{ll} {\varvec{\Upsigma}}_{{{\mathbf{u}}}{{\mathbf{u}}}}^R & {\varvec{\Upsigma}}_{{{\mathbf{u}}}s}^R\\ {\varvec{\Upsigma}}_{s{{\mathbf{u}}}}^R & {\varvec{\Upsigma}}_{ss}^R\\ \end{array}\right], \quad {{\mathbf{z}}} = \left[\begin{array}{l} {{\mathbf{z}}}_{{\mathbf{u}}}\\ {{\mathbf{z}}}_s\\ \end{array}\right] $$

(11)

Note that when ${\varvec{\Upsigma}}_R$ is diagonal, with constant entries along its main diagonal, the drift prediction obtained from Eq. 1 coincides with that of OLS. Again, a unique solution for the estimate ${\hat{\varvec{\upbeta}}}$ requires the ((n + K) × (n + K)) matrix of variogram values of residuals ${\varvec{\Upsigma}}_R$ to be non-singular and the ((n + K) × (M + 1)) design matrix F to be of full column rank.
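The GLS estimator of Eq. 10 and its reduction to OLS can be checked numerically. The following is a minimal sketch, not the authors' implementation: the function name `gls_beta`, the random toy design matrix, and the exponential covariance with unit correlation length are all assumptions made for illustration.

```python
import numpy as np

def gls_beta(F, Sigma_R, z):
    """GLS coefficients (F^T S^-1 F)^-1 F^T S^-1 z, as in Eq. 10."""
    Si_F = np.linalg.solve(Sigma_R, F)        # S^-1 F without forming the inverse
    # Sigma_R is symmetric, so F^T S^-1 z equals (S^-1 F)^T z
    return np.linalg.solve(F.T @ Si_F, Si_F.T @ z)

# Toy data (made up): 8 observations, intercept plus 2 covariates
rng = np.random.default_rng(0)
n_obs = 8
F = np.column_stack([np.ones(n_obs), rng.normal(size=(n_obs, 2))])
z = rng.normal(size=n_obs)

# Case 1: diagonal covariance with constant entries, so GLS should equal OLS
beta_iid = gls_beta(F, 2.5 * np.eye(n_obs), z)
beta_ols, *_ = np.linalg.lstsq(F, z, rcond=None)

# Case 2: correlated residuals (hypothetical exponential covariance on a line)
x = np.arange(n_obs, dtype=float)
Sigma_corr = np.exp(-np.abs(x[:, None] - x[None, :]) / 1.0)
beta_corr = gls_beta(F, Sigma_corr, z)

print("OLS:", beta_ols)
print("GLS, iid residuals:", beta_iid)
print("GLS, correlated residuals:", beta_corr)
```

Solving linear systems with `np.linalg.solve` rather than inverting ${\varvec{\Upsigma}}_R$ explicitly is a numerical convenience; it fails with an error when the matrix is singular, which mirrors the non-singularity requirement stated above.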

Once the drift component is predicted as ${\hat \mu}({{\mathbf{u}}}_p) = {\hat {\varvec{\upbeta}}}^T{{\mathbf{f}}}_p^{{\mathbf{u}}}$ using the GLS estimator of regression coefficients ${\hat{\varvec{\upbeta}}},$ we can predict the unknown residual component ${\hat r}({{\mathbf{u}}}_p)$ at the prediction location ${{\mathbf{u}}}_p$ using A2PSK. The A2PSK prediction at ${{\mathbf{u}}}_p$ is a weighted linear combination of the n “point data” residuals ${{\mathbf{r}}}_{{\mathbf{u}}} = [r({{\mathbf{u}}}_i) = z({{\mathbf{u}}}_i) - \sum_{m=0}^M f_m({{\mathbf{u}}}_i){\hat \upbeta}_m,\;i = 1, \ldots, n]^T$ and the K “areal data” residuals ${{\mathbf{r}}}_s = [r(s_k) = z(s_k) - \sum_{m=0}^M f_m(s_k){\hat{\upbeta}}_m,\; k = 1, \ldots, K]^T$ as:

$$ {\hat r}({{\mathbf{u}}}_p) = \tilde{{\varvec{\upeta}}}_p^T {{\mathbf{r}}}_{{\mathbf{u}}} + \tilde{{\varvec{\lambda}}}_p^T {{\mathbf{r}}}_s = \sum_{i=1}^{n} \tilde{\eta}_p({{\mathbf{u}}}_i) r({{\mathbf{u}}}_i) + \sum_{k=1}^{K} \tilde{\lambda}_p(s_k)r(s_k). $$

(12)

Here, the A2PSK weights $\tilde{{\varvec{\upeta}}}_p = [{\tilde{\eta}}_p({{\mathbf{u}}}_i), i=1, \ldots, n]^T$ for the n point data and the weights $\tilde{\varvec{\lambda}}_p = [\tilde \lambda_p(s_k), k = 1, \ldots, K]^T$ for the K areal data are determined by solving an A2PSK system similar to that of Eq. 4:

$$ \left[\begin{array}{ll} {\varvec{\Upsigma}}_{{{\mathbf{u}}}{{\mathbf{u}}}}^R & {\varvec{\Upsigma}}_{{{\mathbf{u}}}s}^R \\ {\varvec{\Upsigma}}_{s{{\mathbf{u}}}}^R & {\varvec{\Upsigma}}_{ss}^R\\ \end{array}\right] \left[\begin{array}{l} \tilde{{\varvec{\upeta}}}_p \\ {\tilde{\varvec{\lambda}}}_p\\ \end{array}\right]= \left[\begin{array}{l} {\varvec{\sigma}}_p^{{\mathbf{u}}} \\ {\varvec{\sigma}}_p^s\\ \end{array}\right] $$

(13)

where the covariance model of residuals involved in the above A2PSK system is assumed to be identical to that used in Eq. 4.

In summary, A2PKED is equivalent to optimum drift estimation followed by area-to-point simple Kriging of the residuals from this drift estimate, as if the mean were known. The A2PKED prediction error variance accounts for the fact that the drift is actually unknown but estimated.
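The two-step reading of A2PKED summarized above (GLS drift, then simple Kriging of residuals) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the exponential covariance `exp_cov`, the drift functions in `drift` (an intercept and the x-coordinate), the approximation of areal covariances by averaging over discretizing cell centres, and all data values are hypothetical.

```python
import numpy as np

def exp_cov(a, b, sill=1.0, corr_len=2.0):
    """Hypothetical exponential point covariance between (p,2) and (q,2) coordinates."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-d / corr_len)

def drift(xy):
    """Assumed drift functions: f_0 = 1 and f_1 = x-coordinate."""
    return np.column_stack([np.ones(len(xy)), xy[:, 0]])

def build_system(pts, areas):
    """Assemble F and Sigma_R as in Eq. 11; areal terms are averages over cells."""
    F = np.vstack([drift(pts)] + [drift(c).mean(axis=0) for c in areas])
    S_uu = exp_cov(pts, pts)
    S_us = np.column_stack([exp_cov(pts, c).mean(axis=1) for c in areas])
    S_ss = np.array([[exp_cov(a, b).mean() for b in areas] for a in areas])
    return F, np.block([[S_uu, S_us], [S_us.T, S_ss]])

def rhs(pts, areas, u_p):
    """Right-hand side of Eq. 13 for a single prediction location u_p."""
    u = u_p[None, :]
    s_u = exp_cov(pts, u)[:, 0]
    s_s = np.array([exp_cov(c, u).mean() for c in areas])
    return np.concatenate([s_u, s_s])

def a2pked(u_p, pts, areas, z):
    """Drift by GLS (Eq. 10) plus simple Kriging of residuals (Eqs. 12 and 13)."""
    F, Sigma_R = build_system(pts, areas)
    Si_F = np.linalg.solve(Sigma_R, F)
    beta = np.linalg.solve(F.T @ Si_F, Si_F.T @ z)
    weights = np.linalg.solve(Sigma_R, rhs(pts, areas, u_p))
    return drift(u_p[None, :])[0] @ beta + weights @ (z - F @ beta)

# Toy layout (made up): two areas, each discretized by four cell centres
gx, gy = np.meshgrid(np.arange(0.5, 4), np.arange(0.5, 2))
cells = np.column_stack([gx.ravel(), gy.ravel()])
areas = [cells[cells[:, 0] < 2], cells[cells[:, 0] > 2]]
pts = np.array([[0.2, 0.3], [1.7, 1.6], [3.1, 0.9]])
z = np.array([3.0, 2.0, 4.5, 2.4, 4.1])   # three point values, then two area averages

print(a2pked(np.array([1.0, 1.0]), pts, areas, z))
```

Under these assumptions the sketch honours the point data exactly and, because the same cell centres define both the areal covariances and the prediction grid, the average of the predictions within an area returns that area's datum.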

Appendix 2: Estimation of local drift coefficients in A2PKED

Consider the task of predicting the unknown value at location ${{\mathbf{u}}}_p$ using n point data and a subset of the areal data instead of all K such data. This modification of the original A2PKED system in Eq. 4 may be necessary when dealing with large data sets of point support as well as areal support, or when statistical modeling of spatial variation, i.e., spatially varying regression model coefficients, is adopted.

In our case study, we include only the one areal datum in which the prediction location ${{\mathbf{u}}}_p$ falls, which may affect the efficiency of the GLS estimator as well as the residual component. This change to the original system may cause problems in the design matrix, particularly when dummy variables associated with areal data are included. However, it does not destroy the desirable properties of Kriging prediction, such as unbiasedness and minimum prediction error variance. In what follows, we present the A2PKED prediction at location ${{\mathbf{u}}}_p$ in the form of a linear regression with correlated residuals based on n point data and a single areal datum:

$$ {\hat z}({{\mathbf{u}}}_p) = \sum_{m=0}^M {\hat \upbeta}_m({{\mathbf{u}}}_p) f_m({{\mathbf{u}}}_p) + \left[\sum_{i=1}^{n} \tilde{\eta}_p({{\mathbf{u}}}_i) r({{\mathbf{u}}}_i) + \tilde{\lambda}_p(s_k)r(s_k) \right] $$

(14)

where ${\tilde \eta}_p({{\mathbf{u}}}_i)$ and ${\tilde \lambda}_p(s_k)$ denote, respectively, the A2PSK weight assigned to the ith point-support residual $r({{\mathbf{u}}}_i) = z({{\mathbf{u}}}_i)-{\hat \mu}({{\mathbf{u}}}_i)$ and the weight assigned to the areal residual $r(s_k) = z(s_k) - {\hat \mu}(s_k)$ of the region $s_k$ in which the prediction location ${{\mathbf{u}}}_p$ falls. Note that the A2PSK weights for the point data and the single areal datum in Eq. 14 need to be updated at each prediction location, because the areal datum associated with each prediction location is subject to change. This amounts to estimating a spatially varying (local) linear drift component whose GLS regression coefficients are constant within each neighborhood, but vary from one area to another. For example, the GLS regression coefficients ${\hat{\upbeta}}_m({{\mathbf{u}}}_p)$ with $m = 0, \ldots, M$ at location ${{\mathbf{u}}}_p \in s_k$ are different from those ${\hat \upbeta}_m({{\mathbf{u}}}_{p'})$ at location ${{\mathbf{u}}}_{p'}$ if ${{\mathbf{u}}}_{p'} \in s_{k'}$ for $k \neq k'$.

Typically, the application of GLS to hedonic price models assumes that the relationship between house price and covariates is fixed over the study area. The proposed approach takes into account heteroskedasticity present in house prices so that the implicit price of housing attributes varies spatially by submarkets defined by areal units (Orford 2000).
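The local variant of Eq. 14 can be illustrated with a one-dimensional toy problem in which each prediction uses all point data but only the areal datum of the interval containing the prediction location. The sketch below is an assumption-laden illustration rather than the authors' procedure: the function `local_a2pked`, the exponential covariance, the drift (intercept plus coordinate), the interval edges, and every data value are hypothetical.

```python
import numpy as np

def cov(a, b, corr_len=2.0):
    """Hypothetical exponential covariance between 1-D coordinate arrays."""
    return np.exp(-np.abs(a[:, None] - b[None, :]) / corr_len)

def local_a2pked(x_p, x_pts, z_pts, edges, z_areas, n_disc=8):
    """Predict at x_p from all point data plus the single area containing x_p."""
    # Index k of the interval [edges[k], edges[k+1]] that contains x_p
    k = int(np.clip(np.searchsorted(edges, x_p, side="right") - 1,
                    0, len(z_areas) - 1))
    # Cell centres that discretize area k
    c = edges[k] + (np.arange(n_disc) + 0.5) * (edges[k + 1] - edges[k]) / n_disc
    # Design matrix: n point rows, then one row of area-averaged drift functions
    F = np.vstack([np.column_stack([np.ones_like(x_pts), x_pts]),
                   [1.0, c.mean()]])
    z = np.append(z_pts, z_areas[k])
    S_us = cov(x_pts, c).mean(axis=1)
    Sigma = np.block([[cov(x_pts, x_pts), S_us[:, None]],
                      [S_us[None, :], np.array([[cov(c, c).mean()]])]])
    # Local GLS coefficients: constant within area k, different across areas
    Si_F = np.linalg.solve(Sigma, F)
    beta = np.linalg.solve(F.T @ Si_F, Si_F.T @ z)
    # Simple Kriging weights for the n point residuals and one areal residual
    xp = np.array([x_p])
    sigma_p = np.append(cov(x_pts, xp)[:, 0], cov(c, xp).mean())
    w = np.linalg.solve(Sigma, sigma_p)
    z_hat = np.array([1.0, x_p]) @ beta + w @ (z - F @ beta)
    return z_hat, beta, k

# Toy data (made up): four point observations and two areal averages
x_pts = np.array([0.7, 2.9, 5.2, 7.4])
z_pts = np.array([3.1, 2.6, 4.0, 4.4])
edges = np.array([0.0, 4.0, 8.0])
z_areas = np.array([2.0, 5.0])

for x_p in (1.0, 3.0, 6.0):
    z_hat, beta, k = local_a2pked(x_p, x_pts, z_pts, edges, z_areas)
    print(f"x_p={x_p}: area {k}, beta={beta}, prediction={z_hat}")
```

In this sketch the coefficients returned for the two locations inside the first interval are identical, while the location in the second interval receives a different set, which is the area-wise constant behaviour described above.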

About this article

Cite this article

Yoo, EH., Kyriakidis, P.C. Area-to-point Kriging in spatial hedonic pricing models. J Geogr Syst 11, 381–406 (2009). https://doi.org/10.1007/s10109-009-0090-z

Received: 16 October 2008
Accepted: 18 May 2009
Published: 13 June 2009
Issue Date: December 2009
DOI: https://doi.org/10.1007/s10109-009-0090-z
