Abstract
Air pollution is one of the most serious hazards to human health today: an invisible killer that takes many lives every year. Of the many pollutants in the atmosphere, ozone is among the most threatening. It can cause serious health damage, including wheezing, asthma, inflammation, and premature death. Although air pollution can be forecast using chemical and physical models, machine learning techniques, especially artificial neural networks, have shown promising results in this area. Despite the importance of the problem, there has not been any research on predicting ground-level ozone in Jordan. In this paper, we build a model that predicts the next day's ozone concentration in Amman, Jordan from a mixture of meteorological and seasonal variables of the previous day. We compare a multi-layer perceptron neural network (MLP), support vector regression (SVR), decision tree regression (DTR), and extreme gradient boosting (XGBoost). We also explore the effect of applying smoothing filters to the time-series data, namely moving average, Holt-Winters smoothing, and Savitzky-Golay filters. We find that MLP outperformed the other algorithms and that Savitzky-Golay filtering improved the results by 50% for the coefficient of determination (R²) and by 80% for root mean square error (RMSE) and mean absolute error (MAE). We also examine which variables are required to predict ozone concentration. To reduce the time required for prediction, we perform feature selection, which cuts that time by 91% and shrinks the set of features required for prediction to the previous day's values of ozone, humidity, and temperature. The final model scored 98.653% for R², 1.016 ppb for RMSE, and 0.800 ppb for MAE.
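The next-day framing and the three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the daily values are made up, a centred moving average stands in for the Savitzky-Golay filter, and a persistence baseline (tomorrow's ozone equals today's) stands in for the MLP. All function names are hypothetical.

```python
import math


def moving_average(values, window=3):
    """Centred moving average; the window shrinks at the series edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def make_next_day_pairs(ozone, humidity, temperature):
    """Pair day t's (ozone, humidity, temperature) with day t+1's ozone."""
    n = len(ozone) - 1
    features = [(ozone[t], humidity[t], temperature[t]) for t in range(n)]
    targets = [ozone[t + 1] for t in range(n)]
    return features, targets


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up daily values for illustration only (ozone in ppb, humidity in %, temperature in deg C).
ozone = [30.0, 34.0, 29.0, 41.0, 38.0, 45.0, 40.0, 36.0]
humidity = [55.0, 50.0, 62.0, 41.0, 44.0, 38.0, 47.0, 52.0]
temperature = [21.0, 23.0, 20.0, 27.0, 26.0, 29.0, 25.0, 24.0]

smoothed = moving_average(ozone, window=3)
X, y = make_next_day_pairs(smoothed, humidity, temperature)

# Persistence baseline: predict tomorrow's ozone as today's (first feature).
baseline = [row[0] for row in X]
print("R2:", r2(y, baseline), "RMSE:", rmse(y, baseline), "MAE:", mae(y, baseline))
```

Note that a centred window uses the following day's value when smoothing day t, so in a real forecasting experiment the filter would need to be applied in a way that does not leak future observations into the features.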
Acknowledgements
The authors are grateful to the Applied Science Private University, Amman, Jordan, for the financial support granted to this research. The authors would also like to thank the Jordanian Ministry of Environment for giving them access to perform scientific research on the King Al-Hussein Public Parks air pollution and meteorological data collected by the ministry.
Additional information
Recommended by Associate Editor Paul Stewart
Maryam Aljanabi received the B. Eng. degree in computer engineering from Omar Almukhtar University, Libya in 2017, and the M. Sc. degree in computer science from the Applied Science Private University, Jordan in 2020.
Her research interests include machine learning and its applications, artificial intelligence, environmental science, and data science.
Mohammad Shkoukani received the B. Sc. degree from Applied Science Private University, Jordan in 2002, and the M. Sc. degree from Arab Academy for Banking and Financial Sciences, Jordan in 2004, both in computer. He received the Ph. D. degree in computer information systems from Arab Academy for Banking and Financial Sciences, Jordan in 2009. He is an associate professor at Applied Science Private University, Jordan.
His research interests include agent-oriented software engineering, information systems security, and machine learning.
Mohammad Hijjawi received the Ph. D. degree from Manchester Metropolitan University, UK in 2011. He is an associate professor in the Computer Science Department, Faculty of Information Technology, Applied Science Private University, Jordan. He has previous computing-based training in several domains. He also acts as the Faculty of Information Technology Dean at Applied Science Private University, Jordan in 2015.
His research interests include natural language processing and machine learning.
Cite this article
Aljanabi, M., Shkoukani, M. & Hijjawi, M. Ground-level Ozone Prediction Using Machine Learning Techniques: A Case Study in Amman, Jordan. Int. J. Autom. Comput. 17, 667–677 (2020). https://doi.org/10.1007/s11633-020-1233-4
DOI: https://doi.org/10.1007/s11633-020-1233-4