Restaurant Health Inspections and Crime Statistics Predict the Real Estate Market in New York City

  • Conference paper
Machine Learning, Optimization, and Data Science (LOD 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11943)

Abstract

Predictions of apartment prices in New York City (NYC) have always been of interest to new homeowners, investors, Wall Street fund managers, and inhabitants of the city. In recent years, average prices have risen to the highest levels ever recorded, rebounding after the 2008 economic recession. Although average prices are trending up, not every apartment's price is. Different regions of the city have appreciated differently over time; knowing where to buy or sell is essential for all stakeholders. In this project, we propose a predictive analytics framework that analyzes new alternative data sources to extract predictive features of the NYC real estate market. Our experiments indicated that restaurant health inspection data and crime statistics can help predict apartment prices in NYC. The framework we introduce in this work uses an artificial recurrent neural network with Long Short-Term Memory (LSTM) units and incorporates these two predictive features to predict future apartment prices. Empirical results show that feeding predictive features from (1) restaurant inspection data and (2) crime statistics to a neural network with LSTM units results in smaller errors than the traditional Autoregressive Integrated Moving Average (ARIMA) model, which is normally used for this type of regression. Predictive analytics based on non-linear models with features from alternative data sources can capture hidden relationships that linear models are not able to discover. The framework presented in this study has the potential to serve as a supplement to the traditional forecasting tools of real estate markets.
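
The abstract does not spell out how the features enter the network, so the following is only a minimal sketch of one common way to prepare such inputs. It assumes the price, restaurant inspection, and crime series are already aligned on the same time steps for one region. The function name `make_windows`, the `lookback` parameter, and the toy values are hypothetical and not taken from the paper. Each sample is a window of past (price, inspection, crime) rows, and the target is the price at the next step, which is the (samples, time steps, features) layout that LSTM layers typically expect.

```python
from typing import List, Sequence, Tuple


def make_windows(
    prices: Sequence[float],
    inspections: Sequence[float],
    crimes: Sequence[float],
    lookback: int,
) -> Tuple[List[List[List[float]]], List[float]]:
    """Turn aligned series into (window, next-price) training pairs."""
    if not (len(prices) == len(inspections) == len(crimes)):
        raise ValueError("all series must have the same length")
    if lookback < 1 or lookback >= len(prices):
        raise ValueError("lookback must be in [1, len(prices) - 1]")

    inputs: List[List[List[float]]] = []
    targets: List[float] = []
    for start in range(len(prices) - lookback):
        end = start + lookback
        # One row per past time step: [price, inspection feature, crime feature].
        window = [[prices[t], inspections[t], crimes[t]] for t in range(start, end)]
        inputs.append(window)
        # The value to predict is the price right after the window.
        targets.append(prices[end])
    return inputs, targets


# Toy values for illustration only (assumed, not data from the paper).
prices = [100.0, 102.0, 101.0, 105.0, 107.0]
inspections = [12.0, 11.0, 13.0, 10.0, 9.0]
crimes = [30.0, 28.0, 31.0, 27.0, 25.0]

X, y = make_windows(prices, inspections, crimes, lookback=3)
print(len(X), "windows of", len(X[0]), "steps with", len(X[0][0]), "features")
```

Dropping the inspection and crime columns from each row gives a price-only input, which is one way to check whether the alternative features reduce forecasting error against a univariate baseline.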

Acknowledgments

We would like to thank Jing Wang, who contributed helpful discussions to the initial analysis of this work. We also thank the NYU High Performance Computing team, especially Shenglong Wang, who was always available to help with technical issues on the computer cluster.

Author information

Corresponding author

Correspondence to Rafael M. Moraes.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Moraes, R.M., Bari, A., Zhu, J. (2019). Restaurant Health Inspections and Crime Statistics Predict the Real Estate Market in New York City. In: Nicosia, G., Pardalos, P., Umeton, R., Giuffrida, G., Sciacca, V. (eds) Machine Learning, Optimization, and Data Science. LOD 2019. Lecture Notes in Computer Science, vol 11943. Springer, Cham. https://doi.org/10.1007/978-3-030-37599-7_45

  • DOI: https://doi.org/10.1007/978-3-030-37599-7_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-37598-0

  • Online ISBN: 978-3-030-37599-7

  • eBook Packages: Computer Science, Computer Science (R0)
