Abstract
Accurate spatial prediction models of air pollutant concentrations can help compensate for the shortage of monitoring stations, especially in low- and middle-income countries. However, given the large diversity of model types, whether statistical, numerical, or machine learning (ML) based, it is not clear which are most suitable for this task. In this paper we study the predictive capabilities of common machine learning methods for the spatial prediction of PM\(_{2.5}\) concentration levels. Three relevant factors were examined: the extent to which meteorological variables affect prediction performance; the effect of variable normalization by inverse distance weighting (IDW); and the number of neighboring stations needed to maximize predictive performance. Results on a dataset from the Beijing monitoring network show that simple models such as linear regressors trained on IDW-normalized variables can cope with this task. From these results we derive some guidelines for building competent ML-based models for the spatial prediction of PM\(_{2.5}\) concentrations.
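As a rough illustration of the inverse distance weighting idea mentioned above, the following sketch applies Shepard-style IDW to the k nearest stations. It is not the authors' normalization procedure, which the abstract does not spell out. The function name `idw_estimate`, the planar Euclidean distance, the default exponent `power=2.0`, and the default `k=3` are all assumptions made for this example.

```python
import numpy as np

def idw_estimate(target_xy, station_xy, station_values, k=3, power=2.0):
    """Hypothetical helper: IDW estimate at target_xy from the k nearest stations.

    Assumes planar coordinates and Euclidean distance; real station data
    given as latitude/longitude would need a geodesic distance instead.
    """
    station_xy = np.asarray(station_xy, dtype=float)
    station_values = np.asarray(station_values, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)

    # Distance from the target location to every station.
    dists = np.linalg.norm(station_xy - target_xy, axis=1)

    # Keep only the k closest stations (the "neighborhood").
    nearest = np.argsort(dists)[:k]
    d = dists[nearest]
    v = station_values[nearest]

    # If the target coincides with a station, return that station's value
    # to avoid dividing by zero.
    if np.any(d == 0):
        return float(v[d == 0].mean())

    # Weights decay with distance: w_i = 1 / d_i^power.
    w = 1.0 / d**power
    return float(np.sum(w * v) / np.sum(w))
```

Varying `k` in such a sketch is one simple way to probe how many neighboring stations a prediction benefits from, which is the third factor the abstract lists.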
Notes
- 1. Lag is expressed in units of time (e.g., hours) and corresponds to the amount of historical data that the model is allowed to use for prediction.
- 2.
- 3.
- 4.
- 5. Statistic that measures how well the model replicates the observed results, expressed as the proportion of the variation in the results that the model explains [14].
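Note 5 does not name the statistic. Assuming it refers to the coefficient of determination \(R^2\), its usual definition for observed values \(y_i\), model predictions \(\hat{y}_i\), and observed mean \(\bar{y}\) is:

\[
R^2 = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^2}{\sum_{i} (y_i - \bar{y})^2}
\]

A value of 1 means the predictions match the observations exactly, while a value of 0 means the model explains no more variation than always predicting \(\bar{y}\).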
References
Baumann, L.M., et al.: Effects of distance from a heavily transited avenue on asthma and atopy in a periurban shantytown in Lima, Peru. J. Aller. Clin. Immunol. 127(4), 875–882 (2011)
Bellinger, C., Jabbar, M.S.M., Zaïane, O., Osornio-Vargas, A.: A systematic review of data mining and machine learning for air pollution epidemiology. BMC Public Health 17(1), 907 (2017)
Liu, B.C., Binaykia, A., Chang, P.C., Tiwari, M.K., Tsao, C.C.: Urban air quality forecasting based on multi-dimensional collaborative support vector regression (SVR): a case study of Beijing-Tianjin-Shijiazhuang. PLoS ONE 12(7), 1–17 (2017)
Li, X., et al.: Long short-term memory neural network for air pollutant concentration predictions: method development and evaluation. Environ. Pollut. 231, 997–1004 (2017)
Xu, Y., Yang, W., Wang, J.: Air quality early-warning system for cities in China. Atmos. Environ. 148, 239–257 (2017)
Freeman, B.S., Taylor, G., Gharabaghi, B., Thé, J.: Forecasting air quality time series using deep learning. J. Air Waste Manage. Assoc. 68, 1–21 (2018)
Reátegui-Romero, W., Sánchez-Ccoyllo, O.R., de Fatima Andrade, M., Moya-Alvarez, A.: PM2.5 Estimation with the WRF/Chem model, produced by vehicular flow in the Lima metropolitan area. Open J. Air Pollut. 7(03), 215 (2018)
Sánchez-Ccoyllo, O.R., et al.: Modeling study of the particulate matter in Lima with the WRF-Chem model: case study of April 2016. Int. J. Appl. Eng. Res. 13(11), 10129–10141 (2018)
Soh, P.W., Chang, J.W., Huang, J.W.: Adaptive deep learning-based air quality prediction model using the most relevant spatial-temporal relations. IEEE Access 6, 38186–38199 (2018)
Wang, J., Song, G.: A deep spatial-temporal ensemble model for air quality prediction. Neurocomputing 314, 198–206 (2018)
Wen, C., Liu, S., Yao, X., Peng, L., Li, X., Hu, Y., Chi, T.: A novel spatiotemporal convolutional long short-term neural network for air pollution prediction. Sci. Total Environ. 654, 1091–1099 (2019)
Rakthanmanon, T., et al.: Searching and mining trillions of time series subsequences under dynamic time warping. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 262–270. ACM, August 2012
Keogh, E., Ratanamahatana, C.A.: Exact indexing of dynamic time warping. Knowl. Inf. Syst. 7(3), 358–386 (2004). https://doi.org/10.1007/s10115-004-0154-9
Steel, R.G., Torrie, J.H.: Principles and Procedures of Statistics. McGraw-Hill Book Company Inc., New York (1960)
Shepard, D.: A two-dimensional interpolation function for irregularly-spaced data. In: Proceedings of the 1968 23rd ACM International Conference, pp. 517–524. ACM, January 1968
OMS: Nueve de cada diez personas de todo el mundo respiran aire contaminado (2018). https://www.who.int/es/news-room/detail/02-05-2018-9-out-of-10-people-worldwide-breathe-polluted-air-but-more-countries-are-taking-action
Naciones Unidas: La Agenda 2030 y los Objetivos de Desarrollo Sostenible: una oportunidad para América Latina y el Caribe (LC/G.2681-P/Rev. 3), Santiago (2018)
Xing, Y.F., Xu, Y.H., Shi, M.H., Lian, Y.X.: The impact of PM2.5 on the human respiratory system. J. Thorac. Dis. 8(1), 69 (2016)
Acknowledgment
The authors gratefully acknowledge financial support from Fondo Nacional de Desarrollo Científico, Tecnológico y de Innovación Tecnológica (Fondecyt) and the World Bank (Grant: 50-2018-FONDECYT-BM-IADT-MU).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Vargas-Campos, I.R., Villanueva, E. (2021). Comparative Study of Spatial Prediction Models for Estimating PM\(_{2.5}\) Concentration Level in Urban Areas. In: Lossio-Ventura, J.A., Valverde-Rebaza, J.C., Díaz, E., Alatrista-Salas, H. (eds) Information Management and Big Data. SIMBig 2020. Communications in Computer and Information Science, vol 1410. Springer, Cham. https://doi.org/10.1007/978-3-030-76228-5_12
DOI: https://doi.org/10.1007/978-3-030-76228-5_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-76227-8
Online ISBN: 978-3-030-76228-5
eBook Packages: Computer Science, Computer Science (R0)