Abstract
Time series forecasting has acquired considerable research importance and is widely applied in air pollution monitoring. This work investigates the abilities of various existing techniques when applied to short-term, high-granularity time series forecasting of PM2.5. More specifically, it provides a comparative study that covers both popular and less commonly used models in this area. The study considers ten well-defined models: ARIMA (auto-regressive integrated moving average), SARIMA (seasonal ARIMA), SES (single exponential smoothing), DES (double exponential smoothing), TES (triple exponential smoothing), ANN (artificial neural network), DT (decision tree), kNN (k-nearest neighbor), LSTM (long short-term memory) and MCFO (Markov chain first order). A framework has been built that categorizes the models, implements them under an identical execution environment and forecasts succeeding values. The models are implemented on five real-world air pollution time series data sets, collected from five differently located government monitoring stations over a period of one year (July 2018-June 2019). Rigorous statistical analysis yields insight into the nature and variability of these time series. Forecasting is carried out on a short-term basis with a focus on high granularity, and three forecast horizon lengths (1 day, 1 week and 1 month) are tested. Finally, the models are compared in terms of three performance measures, namely RMSE (root mean squared error), MAE (mean absolute error) and MAPE (mean absolute percentage error). The comparative results, verified on multiple data sets, show that all the models have lower error for a shorter forecast horizon, with LSTM providing the best performance. Machine learning and deep learning models are found to be superior for longer forecast horizons, with kNN achieving the best accuracy, whereas ARIMA shows significant performance degradation for longer forecast horizons. Moreover, TES, DT, kNN, LSTM and MCFO are found to adapt well to the shape and variability of the data. Studying performance across several lengths of high-granularity forecast horizon over multiple data sets adds value to this work.
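The three performance measures named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of RMSE, MAE and MAPE, which are assumed here to match the ones used in the study. The function names and the sample values are hypothetical and are not taken from the paper's data.

```python
import math


def _check(actual, forecast):
    # Both series must be non-empty and of equal length.
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")


def rmse(actual, forecast):
    """Root mean squared error: sqrt(mean((a - f)^2))."""
    _check(actual, forecast)
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, forecast):
    """Mean absolute error: mean(|a - f|)."""
    _check(actual, forecast)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent: 100 * mean(|a - f| / |a|).

    Undefined when an observed value is zero, so that case is rejected.
    """
    _check(actual, forecast)
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an observed value is zero")
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical PM2.5 readings and forecasts (illustrative values only).
observed = [40.0, 50.0, 80.0, 100.0]
predicted = [44.0, 45.0, 88.0, 90.0]

print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"MAE  = {mae(observed, predicted):.3f}")
print(f"MAPE = {mape(observed, predicted):.3f} %")
```

RMSE penalizes large errors more heavily than MAE because the errors are squared before averaging, while MAPE expresses the error relative to the observed level, which makes it comparable across stations with different pollution levels.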










Acknowledgements
This research work is supported by the project entitled "Participatory and Realtime Pollution Monitoring System For Smart City", funded by Higher Education, Science & Technology and Biotechnology, Department of Science & Technology, Government of West Bengal, India.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Das, R., Middya, A.I. & Roy, S. High granular and short term time series forecasting of \(\hbox {PM}_{2.5}\) air pollutant - a comparative review. Artif Intell Rev 55, 1253–1287 (2022). https://doi.org/10.1007/s10462-021-09991-1
DOI: https://doi.org/10.1007/s10462-021-09991-1