Abstract
Predictions of apartment prices in New York City (NYC) have always been of interest to new homeowners, investors, Wall Street fund managers, and inhabitants of the city. In recent years, average prices have risen to the highest levels ever recorded, rebounding after the 2008 economic recession. Although prices are trending up overall, not all apartments are appreciating. Different regions of the city have appreciated at different rates over time; knowing where to buy or sell is essential for all stakeholders. In this project, we propose a predictive analytics framework that analyzes new alternative data sources to extract predictive features of the NYC real estate market. Our experiments indicate that restaurant health inspection data and crime statistics can help predict apartment prices in NYC. The framework we introduce in this work uses an artificial recurrent neural network with Long Short-Term Memory (LSTM) units and incorporates these two predictive features to predict future apartment prices. Empirical results show that feeding predictive features from (1) restaurant inspection data and (2) crime statistics to a neural network with LSTM units results in smaller errors than the traditional Autoregressive Integrated Moving Average (ARIMA) model, which is commonly used for this type of regression. Predictive analytics based on non-linear models with features from alternative data sources can capture hidden relationships that linear models are not able to discover. The framework presented in this study has the potential to serve as a supplement to the traditional forecasting tools of real estate markets.
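As a rough illustration of the setup described in the abstract, the sketch below shows one way a price series and two alternative-data series could be arranged into fixed-length input windows for a sequence model such as an LSTM, and how forecast error could be scored. It is not the authors' implementation. The function names, the lookback length, the toy values, and the use of root mean squared error as the error measure are all assumptions made for illustration; the abstract only states that the LSTM-based model yields smaller errors than ARIMA.

```python
from math import sqrt

def make_windows(prices, inspections, crimes, lookback):
    """Build (window, target) pairs for a sequence model.

    Each window holds `lookback` consecutive time steps of
    [price, inspection_feature, crime_feature]; the target is the
    price at the step immediately after the window.
    All names and the feature layout are hypothetical.
    """
    if not (len(prices) == len(inspections) == len(crimes)):
        raise ValueError("all series must have the same length")
    samples = []
    for t in range(lookback, len(prices)):
        window = [
            [prices[i], inspections[i], crimes[i]]
            for i in range(t - lookback, t)
        ]
        samples.append((window, prices[t]))
    return samples

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Toy values, purely illustrative (not data from the study).
prices = [100.0, 102.0, 101.0, 105.0, 107.0]
inspections = [3.0, 2.0, 4.0, 1.0, 2.0]
crimes = [10.0, 12.0, 9.0, 8.0, 7.0]

samples = make_windows(prices, inspections, crimes, lookback=2)
targets = [target for _, target in samples]

# Placeholder forecast: repeat the last price in each window.
# A trained model's predictions would be scored the same way.
naive = [window[-1][0] for window, _ in samples]
naive_error = rmse(targets, naive)
```

In a comparison of the kind the abstract describes, the same error function would be applied to the predictions of each candidate model on held-out time steps, so that the models are scored on identical targets.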
Acknowledgments
We would like to thank Jing Wang, who contributed helpful discussions to the initial analysis of this work. We also thank the NYU High Performance Computing team, especially Shenglong Wang, who was always available to help with technical issues on the computer cluster.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Moraes, R.M., Bari, A., Zhu, J. (2019). Restaurant Health Inspections and Crime Statistics Predict the Real Estate Market in New York City. In: Nicosia, G., Pardalos, P., Umeton, R., Giuffrida, G., Sciacca, V. (eds) Machine Learning, Optimization, and Data Science. LOD 2019. Lecture Notes in Computer Science, vol 11943. Springer, Cham. https://doi.org/10.1007/978-3-030-37599-7_45
DOI: https://doi.org/10.1007/978-3-030-37599-7_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37598-0
Online ISBN: 978-3-030-37599-7
eBook Packages: Computer Science, Computer Science (R0)