Modeling and elucidation of housing price

Data Mining and Knowledge Discovery

Abstract

It is widely acknowledged that the value of a house is a mixture of a large number of characteristics. House price prediction thus presents a unique set of challenges in practice. While a large body of work is dedicated to this task, its performance and applications have been limited by the shortage of transaction data covering long time spans, the absence of real-world settings, and the insufficiency of housing features. To this end, a time-aware latent hierarchical model is developed to capture the underlying spatiotemporal interactions behind the evolution of house prices. The hierarchical perspective obviates the need for historical transaction data of exactly the same houses when temporal effects are considered. The proposed framework is examined on a large-scale dataset of property transactions in Beijing. The whole experimental procedure strictly conforms to the real-world scenario. The empirical evaluation results demonstrate that our approach outperforms alternative competitive methods. We also group housing features into external and internal clusters. A further experiment reveals that the external component shapes house prices much more heavily than the internal one does. More interestingly, the inference of latent neighborhood value in our model is empirically shown to lessen the dependence on the critical external cluster of features in house price prediction.

Notes

  1. http://bj.lianjia.com.

  2. https://en.wikipedia.org/wiki/neighborhood.

  3. http://us.spindices.com/index-family/real-estate/sp-case-shiller.

  4. https://www.dropbox.com/s/isdw106x6hjwfkf/data_House_Price.csv?dl=0.

  5. https://en.wikipedia.org/wiki/District_(China).

  6. https://developers.google.com/maps/documentation/geocoding/intro.

  7. https://en.wikipedia.org/wiki/Haversine_formula.

  8. https://keras.io.

  9. http://wiki.china.org.cn/wiki/index.php/five_policies_and_measures_to_regulate_real_estate_market.

  10. http://www.sohu.com/a/131420084_651271.

Author information

Corresponding author

Correspondence to Zhi Wei.

Additional information

Responsible editor: Fei Wang.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tan, F., Cheng, C. & Wei, Z. Modeling and elucidation of housing price. Data Min Knowl Disc 33, 636–662 (2019). https://doi.org/10.1007/s10618-018-00612-0

  • DOI: https://doi.org/10.1007/s10618-018-00612-0
